Rhythm generator using a Variational Autoencoder (VAE). Based on M4L.RhythmVAE by Nao Tokui, modified and extended to support simple and compound meter rhythms with a minimal amount of training data. As with RhythmVAE, the goal of R-VAE is the exploration of latent spaces of musical rhythms. Unlike most previous work in rhythm modeling, R-VAE can be trained with small datasets, enabling rapid customization and exploration by individual users. Its data representation encodes both simple and compound meter rhythms. Models and latent space visualizations for R-VAE are available on the project's GitHub page: https://github.com/vigliensoni/R-VAE-models.
Year: 2022
Website: https://github.com/vigliensoni/R-VAE
Input types: MIDI
Output types: MIDI
Output length: 2 bars
AI Technique: VAE
Dataset: "The Future Sample Pack"
License type: GPLv3
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
A spectral approach to audio analysis and generation with neural networks (LSTM). The techniques included here were used in the Mezzanine Vs. MAGNet project, featured in the Barbican's AI: More than Human exhibition. It represents ongoing work from researchers at The Creative Computing Institute, UAL and Goldsmiths, University of London. MAGNet trains on the magnitude spectra of acoustic audio signals and generates entirely new magnitude spectra that can be turned back into sound using phase reconstruction, with high audio fidelity. This repo lets people train their own models on their own source audio and generate new sounds. Both included projects are designed to be simple to understand and easy to run.
Year: 2019
Website: https://github.com/Louismac/MAGNet
Input types: Audio
Output types: Audio
Output length: Input length
AI Technique: LSTM
Dataset: N/A
License type: BSD 3-Clause
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
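The MAGNet entry above mentions training on magnitude spectra and then recovering sound through phase reconstruction. The sketch below is not MAGNet's code; it is a minimal standard-library illustration of why that reconstruction step is needed at all: a magnitude spectrum throws away phase, so two different waveforms can share the same magnitudes. The frame length `N`, the test bin `BIN`, and the helper names `dft` and `magnitude_spectrum` are assumptions chosen for the example.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of a list of real samples."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def magnitude_spectrum(frame):
    """Keep |X[k]| for the non-negative frequency bins; phase is discarded."""
    spectrum = dft(frame)
    return [abs(c) for c in spectrum[: len(frame) // 2 + 1]]


N = 64   # hypothetical frame length in samples
BIN = 4  # hypothetical test frequency: 4 cycles per frame

# Two frames that differ only in phase (a sine and a cosine).
sine = [math.sin(2 * math.pi * BIN * t / N) for t in range(N)]
cosine = [math.cos(2 * math.pi * BIN * t / N) for t in range(N)]

mag_sine = magnitude_spectrum(sine)
mag_cosine = magnitude_spectrum(cosine)

# The waveforms differ, yet their magnitude spectra match bin for bin.
same_magnitudes = all(abs(a - b) < 1e-6 for a, b in zip(mag_sine, mag_cosine))
print("waveforms identical:", sine == cosine)
print("magnitude spectra identical:", same_magnitudes)
```

Because the magnitudes alone do not pin down a single waveform, a model that generates magnitude spectra needs a separate step that estimates a plausible phase before audio can be synthesized.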
RAVE (Realtime Audio Variational autoEncoder) is a deep learning framework for audio processing and generation that trains a neural network model from audio data. RAVE allows both fast and high-quality audio waveform synthesis (20x real-time at 48 kHz sampling rate on a standard CPU). In Max and Pd, it is accompanied by its nn~ decoder, which enables these models to be used in real time for applications such as audio generation and timbre transformation/transfer.
Year: 2022
Website: https://forum.ircam.fr/collections/detail/rave/
Input types: Audio
Output types: Audio
Output length: Variable / Audio buffer size
AI Technique: VAE
Dataset: N/A
License type: MIT
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
Generates a 4-bar phrase with no input necessary. The number of variations and the temperature can be controlled. The model can be helpful for breaking a creative block or as a source of inspiration for an original sample. Under the hood it uses MusicVAE; you can learn more about it here: https://magenta.tensorflow.org/music-vae. Ready to use as a Max for Live device. If you want to train the model on your own data or try different pre-trained models provided by the Magenta team, refer to the instructions on the team's GitHub page: https://github.com/magenta/magenta/tree/main/magenta/models/music_vae
Year: 2018
Website: https://magenta.tensorflow.org/studio#generate
Input types: Audio
Output types: Audio
Output length: 4 bars
AI Technique: VAE (MusicVAE)
Dataset: "Millions of melodies and rhythms", including NSynth Dataset, MAESTRO dataset, Lakh MIDI dataset
License type: Apache 2.0
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
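The Generate entry above exposes a temperature control. As a rough illustration of what temperature usually means in generative sequence models, the sketch below rescales a set of scores before turning them into sampling probabilities. It is not Magenta's implementation; the function names and the example `logits` are hypothetical.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(logits, temperature, rng):
    """Draw one index according to the temperature-adjusted probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate notes

cold = softmax_with_temperature(logits, 0.5)  # low temperature: more predictable
hot = softmax_with_temperature(logits, 2.0)   # high temperature: more varied

rng = random.Random(0)
choice = sample_index(logits, 1.0, rng)
print("cold:", [round(p, 3) for p in cold])
print("hot: ", [round(p, 3) for p in hot])
```

Lower temperatures concentrate probability on the highest-scoring option, giving more conservative output; higher temperatures spread it out, giving more surprising output.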
WIP
Year: 2016
Website: https://github.com/soroushmehr/sampleRNN_ICLR2017
Input types: Audio
Output types: Audio
Output length: Variable
AI Technique: Hierarchical Recurrent Neural Network (RNN)
Dataset: Not disclosed
License type: MIT
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch:
Mustango is an open-source text-to-music model focused on fine-grained controllability, allowing users to specify musical attributes such as key or chord sequences.
Year: 2023
Website: https://amaai-lab.github.io/mustango/
Input types: Text
Output types: Audio
Output length: 10 sec
AI Technique: Latent Diffusion
Dataset: MusicBench
License type: MIT/CC-BY-SA
Real time:
Free:
Open source:
Checkpoints:
Fine-tune:
Train from scratch: