AudioCraft is made up of three AI models, each focused on a different aspect of sound generation. MusicGen creates music from text prompts and was trained on “20,000 hours of music owned by Meta or specifically licensed for this purpose.” AudioGen, trained on everyday sound effects, generates audio from written prompts, simulating sounds like dogs barking or footsteps.
An improved version of Meta’s EnCodec decoder lets users generate sounds with fewer artifacts, the distortions that creep in when audio is over-processed.
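Because Meta has open-sourced AudioCraft, the models can be tried directly in Python. Here is a minimal sketch of generating a clip with MusicGen, assuming the interface published in the audiocraft repository; the checkpoint name, prompt, and eight-second duration are illustrative choices, not anything from the article:

```python
# Minimal MusicGen sketch, assuming the interface from Meta's open-sourced
# audiocraft repository; checkpoint and parameters are illustrative.
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

model = MusicGen.get_pretrained("facebook/musicgen-small")
model.set_generation_params(duration=8)  # seconds of audio to generate

# One clip is generated per text prompt.
descriptions = ["upbeat acoustic guitar with light percussion"]
wav = model.generate(descriptions)  # tensor of shape [batch, channels, samples]

for idx, clip in enumerate(wav):
    # Write each clip to disk as a .wav at the model's native sample rate.
    audio_write(f"musicgen_{idx}", clip.cpu(), model.sample_rate, strategy="loudness")
```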
The company shared some AudioCraft audio samples with the press. The generated whistling, sirens, and humming sounded fairly natural, and the guitar strings in the songs felt authentic, though they still had a slightly uncanny quality.
Meta is just the latest company to try blending music with AI.
Google’s MusicLM, a large language model available only to researchers, generates minutes’ worth of sound from text prompts. Then an “AI-generated” song featuring the voices of Drake and The Weeknd went viral before being taken down. More recently, several musicians, including Grimes, have encouraged fans to use their voices in AI-created songs.
Musicians have been experimenting with electronic sound for a very long time; EDM, and festivals like Ultra, are nothing new. But computer-made music has typically been built by editing and sampling recorded audio. AudioCraft and other generative AI music is produced from nothing more than a text prompt and a large bank of sound data.
For now, AudioCraft sounds less like the next big pop hit and more like stock music suited to ambience or elevator soundtracks. But Meta believes its new model could usher in a new generation of songs, much as synthesizers did when they first became widely available.
“We think MusicGen can turn into a new type of instrument — just like synthesizers when they first appeared,” the company wrote on its blog. Meta acknowledged the difficulty of building AI models that generate music: a piece of audio often comprises millions of time points at which the model must make a prediction, whereas text models like Llama 2 handle sequences of only thousands of tokens.
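To make that scale concrete, here is a back-of-the-envelope comparison. The 32 kHz figure matches the sample rate Meta reports for MusicGen; the text-token count is an illustrative assumption for a long prompt, not a number from the article:

```python
# Back-of-the-envelope scale comparison: raw audio samples vs. text tokens.
# The 32 kHz rate is MusicGen's reported sample rate; the token count is
# an illustrative assumption.
SAMPLE_RATE_HZ = 32_000        # audio samples per second
CLIP_SECONDS = 60              # one minute of generated music

audio_points = SAMPLE_RATE_HZ * CLIP_SECONDS
text_tokens = 4_000            # a long-ish sequence for a text model

print(f"raw audio samples per minute: {audio_points:,}")              # 1,920,000
print(f"ratio vs. the text sequence: {audio_points // text_tokens}x")  # 480x
```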
The company says that open-sourcing AudioCraft is necessary to diversify the data used to train it.
“We acknowledge the lack of diversity in the datasets used to train our models. In particular, the music dataset used contains only audio-text pairs with text and metadata in English and has a higher proportion of Western-style music,” Meta wrote.
“By sharing the code for AudioCraft, we hope that other researchers can more easily test new approaches to limit or eliminate potential bias in and misuse of generative models.”
Record labels and musicians have already raised the alarm about the risks of AI, since many worry that AI models are trained on copyrighted content, and the music industry has historically been a litigious one.
We all know what happened to Napster, of course, but more recently Spotify was the target of a $1.6 billion lawsuit based on a statute dating back to the era of player pianos, and just this year a jury had to decide whether Ed Sheeran’s “Thinking Out Loud” copied Marvin Gaye.
Before Meta’s “synthesizer” goes on tour, however, someone will need to figure out how to attract listeners to machine-made music that amounts to more than muzak.