Name: DiffRhythm
Description: DiffRhythm is an open-source music generation model. Users provide lyrics and style information as text, along with an optional audio prompt for stylistic context; input audio clips must be shorter than 10 seconds. Supported languages are English and Chinese. DiffRhythm outputs its generated music as an audio file (MP3, WAV, or OGG) up to 285 seconds long.
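Since the model's public interfaces may change, below is a minimal, hypothetical pre-flight check in Python that enforces the input constraints stated above (non-empty lyrics and style text, optional audio prompt under 10 seconds). The function name `validate_inputs` and the use of the `soundfile` library are illustrative assumptions, not part of DiffRhythm's actual API:

```python
# Hypothetical input validation based on the constraints described above.
# `validate_inputs` and the soundfile dependency are illustrative assumptions;
# they are not part of DiffRhythm's own codebase.
from typing import Optional

import soundfile as sf

MAX_PROMPT_SECONDS = 10.0   # audio prompts must be shorter than 10 s
MAX_OUTPUT_SECONDS = 285.0  # generated songs are capped at 285 s

def validate_inputs(lyrics: str, style: str,
                    audio_prompt_path: Optional[str] = None) -> None:
    """Raise ValueError if the inputs violate the documented limits."""
    if not lyrics.strip():
        raise ValueError("Lyrics text must be non-empty.")
    if not style.strip():
        raise ValueError("Style description must be non-empty.")
    if audio_prompt_path is not None:
        duration = sf.info(audio_prompt_path).duration  # clip length in seconds
        if duration >= MAX_PROMPT_SECONDS:
            raise ValueError(
                f"Audio prompt is {duration:.1f} s; it must be under "
                f"{MAX_PROMPT_SECONDS:.0f} s."
            )
```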
Year: 2025
Website: https://github.com/ASLP-lab/DiffRhythm
Input types: Audio, Text
Output types: Audio
Output length: Up to 285 seconds
Technology: Latent Diffusion
Dataset:
License type:
Has real time inference: Yes
Is free: Yes
Is open source: Yes
Are checkpoints available: Yes
Can finetune: Yes
Can train from scratch: Not known
Tags: text-to-audio text-prompt open-source free checkpoints
Guide: A demo is available [here](https://huggingface.co/spaces/ASLP-lab/DiffRhythm). To run it locally on macOS, Windows, or Linux, follow the instructions in the [GitHub repository](https://github.com/ASLP-lab/DiffRhythm). Note that DiffRhythm requires at least 8 GB of VRAM.
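As a quick sanity check before a local run, the sketch below uses PyTorch (which a local DiffRhythm install is assumed to depend on) to confirm the GPU meets the 8 GB VRAM requirement mentioned above; the helper name `has_enough_vram` is hypothetical:

```python
# Pre-flight check for the 8 GB VRAM requirement; the helper name is
# hypothetical, and a CUDA-capable PyTorch install is assumed.
import torch

MIN_VRAM_GB = 8  # minimum VRAM stated in the setup instructions

def has_enough_vram(device_index: int = 0) -> bool:
    """Return True if the chosen CUDA device has at least MIN_VRAM_GB of memory."""
    if not torch.cuda.is_available():
        return False
    total_bytes = torch.cuda.get_device_properties(device_index).total_memory
    return total_bytes / (1024 ** 3) >= MIN_VRAM_GB

if __name__ == "__main__":
    ok = has_enough_vram()
    print("VRAM check passed" if ok else "Fewer than 8 GB of VRAM detected")
```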