Music generation from text descriptions, based on Stable Diffusion. Generation can also be conditioned on an image.
Year: 2022
Website: https://www.riffusion.com/
Input types: Text, Image
Output types: Audio
Output length: around 3 min
AI Technique: Diffusion
Dataset: Not disclosed
License type: MIT
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch: