Name:
Description: Efficient, high-quality text-to-audio generation with a Latent Consistency Model.
Year:
Website:
Input types: Audio MIDI Text None Genre Metadata Image
Output types: Audio MIDI
Output length:
Technology: Not Specified Latent Consistency Model Latent Diffusion LSTM VAE Sequence-to-sequence neural network Transformer Suite of AI tools Diffusion Hierarchical Recurrent Neural Network (RNN) Autoregressive Convolutional Neural Network
Dataset:
License type:
Has real time inference: Yes No Not known
Is free: Yes No Yes and No, depending on the plan Not known
Is open source: Yes No Not known
Are checkpoints available: Yes No Not known
Can finetune: Yes No Not known
Can train from scratch: Yes No Not known
Tags: text-to-audio MIDI text-prompt small-dataset open-source low-resource free checkpoints proprietary no-input image-to-audio
Guide: The project's GitHub repository contains instructions for using the pre-trained models, preparing a dataset, and training a model from scratch: [https://github.com/liuhuadai/AudioLCM](https://github.com/liuhuadai/AudioLCM)