MusicGen - Music Generator

Jun 25, 2023 #AI #Text2Music #META

This post was translated from Chinese into English by AI.

AI-generated summary

MusicGen is a Transformer-based AI music generator that turns text descriptions into 12-second audio clips. It uses Meta's EnCodec audio tokenizer to break audio into smaller parts and predicts the next part of the music segment, much as a language model predicts the next letter in a phrase. It can take text and music prompts together, and its single-stage design keeps generation fast and efficient. Deploying MusicGen means installing the required packages and downloading one of the pre-trained models. An official demo and an online test page are also available. This article is a tool-sharing record and is synchronized with HBlog.

MusicGen is an AI music generator based on the Transformer model. It turns text descriptions into 12-second audio clips.

Features

MusicGen uses Meta's EnCodec audio tokenizer to break audio into small discrete parts (tokens), then predicts the next part of the music segment, much as a language model predicts the next letter in a phrase. It accepts text and music prompts together, and its single-stage design keeps generation fast and efficient.
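The tokenize-then-predict idea can be illustrated with a toy sketch. This is not MusicGen or EnCodec: `quantize` is a hypothetical uniform quantizer standing in for the learned tokenizer, and a simple bigram count table stands in for the Transformer. All function names here are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def quantize(samples, levels=8):
    """Map floats in [-1, 1] to integer tokens 0..levels-1.

    A crude stand-in for a learned audio tokenizer (assumption, not EnCodec).
    """
    tokens = []
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clip to the expected range
        tokens.append(min(levels - 1, int((s + 1.0) / 2.0 * levels)))
    return tokens


def fit_bigram(tokens):
    """Count which token follows which (a toy 'language model')."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def continue_sequence(counts, prompt, steps):
    """Greedily append the most frequent follower of the last token."""
    out = list(prompt)
    for _ in range(steps):
        followers = counts.get(out[-1])
        if not followers:
            break  # nothing seen after this token, so stop early
        out.append(followers.most_common(1)[0][0])
    return out


# A sine wave plays the role of the "audio" to tokenize.
wave = [math.sin(2 * math.pi * i / 16) for i in range(64)]
tokens = quantize(wave)
model = fit_bigram(tokens)
continuation = continue_sequence(model, tokens[:4], steps=8)
print(continuation)
```

The real system differs in every component, but the loop has the same shape: turn audio into discrete tokens, then repeatedly predict the next one from what came before.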

Deployment

Install the project

pip install 'torch>=2.0'
git clone https://github.com/facebookresearch/audiocraft.git
cd audiocraft
pip install -e .  # editable install from the cloned repo

Download pre-trained models

small: 300M model, text to music only

medium: 1.5B model, text to music only

melody: 1.5B model, text to music and text+melody to music

large: 3.3B model, text to music only
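The list above can be captured as a small lookup for choosing a variant. This is a sketch under assumptions: `MODEL_VARIANTS`, `variants_supporting`, and `load_model` are hypothetical helpers, and the `MusicGen.get_pretrained` call follows the audiocraft README of the time. Newer audiocraft releases may expect a prefixed name such as `facebook/musicgen-small`, so check the version you installed.

```python
# Variant table copied from the list above; the helper names are illustrative.
MODEL_VARIANTS = {
    "small": {"params": "300M", "melody": False},
    "medium": {"params": "1.5B", "melody": False},
    "melody": {"params": "1.5B", "melody": True},
    "large": {"params": "3.3B", "melody": False},
}


def variants_supporting(melody_conditioning):
    """Return variant names that match the requested melody support."""
    return [
        name
        for name, info in MODEL_VARIANTS.items()
        if info["melody"] == melody_conditioning
    ]


def load_model(name):
    """Load a pre-trained MusicGen variant (downloads weights on first use)."""
    if name not in MODEL_VARIANTS:
        raise ValueError(f"unknown variant {name!r}; choose from {list(MODEL_VARIANTS)}")
    # Imported lazily so the table can be inspected without audiocraft installed.
    from audiocraft.models import MusicGen
    return MusicGen.get_pretrained(name)


print(variants_supporting(True))
```

Starting with `small` is a reasonable default on limited hardware; `melody` is the only variant in the list that also accepts a melody prompt.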

Run MusicGen
```
python app.py
```
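Besides the demo app, music can also be generated from a Python script. The following is a minimal sketch based on the usage shown in the audiocraft README (`MusicGen.get_pretrained`, `set_generation_params`, `generate`, `audio_write`). The wrapper `generate_clips`, the helper `output_stem`, and the default values are assumptions for illustration, and the API may differ between audiocraft versions.

```python
def output_stem(prefix, index):
    """Build an output file name without extension, e.g. 'clip_0'."""
    return f"{prefix}_{index}"


def generate_clips(descriptions, model_name="small", duration=12, out_prefix="clip"):
    """Generate one audio file per text description and return the file names."""
    if not descriptions:
        raise ValueError("at least one text description is required")

    # Imported lazily: these need audiocraft and torch to be installed.
    from audiocraft.models import MusicGen
    from audiocraft.data.audio import audio_write

    model = MusicGen.get_pretrained(model_name)
    model.set_generation_params(duration=duration)  # clip length in seconds
    wavs = model.generate(descriptions)  # one waveform per description

    paths = []
    for index, one_wav in enumerate(wavs):
        stem = output_stem(out_prefix, index)
        # audio_write appends the extension itself (.wav by default).
        audio_write(stem, one_wav.cpu(), model.sample_rate, strategy="loudness")
        paths.append(stem + ".wav")
    return paths


if __name__ == "__main__":
    print(generate_clips(["happy rock", "calm piano with soft strings"]))
```

The first call downloads the model weights, so expect it to take noticeably longer than later runs.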

Platform

Official demo
Online testing

Disclaimer

This article is only for sharing tools.

This article is synchronized with HBlog.