Efficient and high-quality text-to-audio generation with a Latent Consistency Model.
Year: 2023
Website: https://audiolcm.github.io/
Input types: Text
Output types: Audio
Output length: Variable
AI Technique: Latent Diffusion
Dataset: Not disclosed for the "teacher" model; AudioCaps dataset (Kim et al., 2019) for the AudioLCM model
License type: MIT
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
The project's GitHub repository contains instructions for using the pre-trained models, preparing a dataset, and training a model from scratch: https://github.com/liuhuadai/AudioLCM