A collection of AI-powered generative tools: a sample generator (Deep Sampler), a drum sound generator (Emergent Drums), a sample pack generator (Infinite Packs), and a drum sound variation generator (Humanize).
Year: 2021
Website: https://audialab.com/
Input types: Text
Output types: Audio
Output length: Short
AI Technique: Not Specified
Dataset: Not disclosed
License type: Proprietary
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
Efficient and high-quality text-to-audio generation with a Latent Consistency Model.
Year: 2023
Website: https://audiolcm.github.io/
Input types: Text
Output types: Audio
Output length: Variable
AI Technique: Latent Diffusion
Dataset: "Teacher" model: not disclosed; AudioLCM model: AudioCaps dataset (Kim et al., 2019)
License type: MIT
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
The first in a suite of generative audio tools for producers and musicians to be released by Harmonai. The provided Jupyter notebooks allow users to perform unconditional random audio sample generation, audio sample regeneration or style transfer using a single audio file or recording, and audio interpolation between two audio files.
Year: 2022
Website: https://github.com/Harmonai-org/sample-generator
Input types: Audio Text
Output types: Audio
Output length: Variable
AI Technique: Latent Diffusion
Dataset: Online sources - glitch.cool, songaday.world, MAESTRO dataset, Unlocked Recordings, xeno-canto.org
License type: MIT
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
Loudly Generator lets you generate music either by selecting a genre and other options or by entering a text prompt. The new beta version also offers the option to include your own audio clips.
Year: 2023
Website: https://www.loudly.com/
Input types: Text Metadata
Output types: Audio
Output length: 7 min
AI Technique: Not Specified
Dataset: Not disclosed
License type: Proprietary
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
Platform for generating tracks, loops, mixes and jingles between 5 seconds and 5 minutes long. The desired output can be prompted by text or conditioned on an image (uploaded or linked to), or on additional information such as genre, mood or activity. These labels can be chosen from a long list of predefined options.
Year: 2016
Website: https://mubert.com/
Input types: Text Genre Metadata Image
Output types: Audio
Output length: 5s-5min
AI Technique: Not Specified
Dataset: Not disclosed
License type: Proprietary
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
Language model for conditional music generation, developed by Meta. The output can be prompted by a text description and additionally conditioned on a melody.
Year: 2023
Website: https://ai.honu.io/papers/musicgen/
Input types: Text
Output types: Audio
Output length: 30s
AI Technique: Transformer
Dataset: NSynth Dataset; Others not disclosed
License type: MIT/CC-BY-NC
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
Music generation from text descriptions developed by Google. The output can be prompted by a text description and additionally conditioned on a melody.
Year: 2023
Website: https://google-research.github.io/seanet/musiclm/examples/
Input types: Text
Output types: Audio
Output length: 30s
AI Technique: Transformer
Dataset: MusicCaps, AudioSet
License type: Proprietary
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
Open-source platform capable of generating music from text prompts.
Year: 2023
Website: https://okio.ai/
Input types: Audio Text
Output types: Audio
Output length: Variable
AI Technique: Suite of AI tools
Dataset: Not disclosed
License type: MIT for core tools
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
Music generation from text descriptions, based on Stable Diffusion. Can be conditioned on an image.
Year: 2022
Website: https://www.riffusion.com/
Input types: Text Image
Output types: Audio
Output length: around 3min
AI Technique: Diffusion
Dataset: Not disclosed
License type: MIT
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
An online platform for music generation based on a predefined genre or style. Users can select a format suitable for a specific type of content (e.g. social media, gaming, vlogs), or a type of output (e.g. loops, sfx).
Year: 2023
Website: https://soundful.com/
Input types: Text Metadata
Output types: Audio MIDI
Output length: 2.5min
AI Technique: Not Specified
Dataset: Not disclosed
License type: Proprietary
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
Splash Pro is a platform for music generation from text descriptions. You can specify the desired BPM and key. The platform contains a text-to-vocals model for synthesising realistic vocals. The output can be downloaded in MP3 format. From Splash's website: "We have been developing our own proprietary technology and high quality audio datasets since 2017. Our AI research and capabilities include Text-to-Singing, Text-to-Rap, Generative Text-to-Music, Composition, Melody, Voice Transfer, Lyrics and Mastering."
Year: 2023
Website: https://www.splashmusic.com/
Input types: Text
Output types: Audio
Output length: 30s-3min
AI Technique: Not Specified
Dataset: Data collected and owned by Splash as well as data freely available under Creative Commons license
License type: Proprietary
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
Open-source text-to-audio model for generating samples and sound effects from text descriptions. The model supports audio variations and style transfer of audio samples. The creators claim it is ideal for creating drum beats, instrument riffs, ambient sounds, foley recordings and other audio samples for music production and sound design. Generates stereo audio at 44.1 kHz.
Year: 2023
Website: https://stability.ai/news/introducing-stable-audio-open
Input types: Audio Text
Output types: Audio
Output length: 47s
AI Technique: Diffusion
Dataset: freesound.org, freemusicarchive.org
License type:
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
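As a rough illustration of what the figures in the entry above imply (stereo audio at 44.1 kHz, output length 47s), the sketch below computes the size of one maximum-length clip. The 16-bit PCM sample width is an assumption made for the arithmetic, not something the entry states, and `pcm_size` is a hypothetical helper name.

```python
def pcm_size(sample_rate_hz, channels, duration_s, bytes_per_sample=2):
    """Return (frames, samples, bytes) for an uncompressed PCM clip.

    bytes_per_sample=2 assumes 16-bit PCM, which is an illustrative
    assumption and not a property stated in the catalogue entry.
    """
    frames = sample_rate_hz * duration_s      # one frame per sampling instant
    samples = frames * channels               # each frame holds one sample per channel
    size_bytes = samples * bytes_per_sample   # raw size without any container header
    return frames, samples, size_bytes


# Figures taken from the entry: 44.1 kHz, stereo, 47 seconds.
frames, samples, size_bytes = pcm_size(44_100, 2, 47)
print(f"{frames:,} frames, {samples:,} samples, {size_bytes / 2**20:.1f} MiB")
```

Under these assumptions a full-length clip is a little under 8 MiB of raw audio, before any compression.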
Full song generation from a text description. Can generate songs with lyrics. The generated songs can then be remixed or extended.
Year: 2023
Website: https://suno.com
Input types: Text
Output types: Audio
Output length: around 2-4min
AI Technique: Not Specified
Dataset: Not disclosed
License type: Proprietary
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
Full song generation using text prompts. The songs can contain lyrics, prompted separately. The model can only be accessed through a proprietary platform which also offers clip and lyrics editing tools.
Year: 2024
Website: https://www.udio.com/
Input types: Audio Text
Output types: Audio
Output length: 30s
AI Technique: Not Specified
Dataset: Not disclosed
License type: Proprietary
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
TwoShot's Coproducer is an all-in-one AI assistant that helps creators produce high-quality, commercially-safe audio. The platform enables users to generate full tracks from hummed melodies or simple text prompts; remix existing songs, split audio into stems, and create unique samples; and automatically score video scenes with context-aware sound effects. Designed for both beginners and professionals, Coproducer integrates seamlessly with industry-standard DAWs (e.g., Ableton, Logic) and is built on a 100% ethically-sourced, rights-cleared foundation.
Year: 2025
Website: https://twoshot.ai/coproducer
Input types: Audio MIDI Text Genre Metadata
Output types: Audio MIDI
Output length:
AI Technique: Suite of AI tools
Dataset: Proprietary licensed dataset
License type: Depends on inputs and models; royalty-free content can be generated in several ways
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch: