AI-generated songs are getting longer, not necessarily better

The latest version of Stability AI’s audio generator, Stable Audio, now lets users create three-minute-long songs.

Illustration by Alex Castro / The Verge

Stable Audio 2.0, Stability AI’s audio generation model, now lets users upload their own audio samples and transform them with prompts to create AI-generated songs. But the results won’t win any Grammys just yet.

The first version of Stable Audio, released in September 2023, offered some paying users clips of up to 90 seconds, which was only enough for short experimental snippets. Stable Audio 2.0 produces a full three-minute track, the length of most radio-friendly songs. All uploaded audio must be copyright-free.

Unlike OpenAI’s audio generation model, Voice Engine, which is available only to a select group of users, Stable Audio is free and publicly available through Stability AI’s website and, soon, its API.

One big difference between Stable Audio 2.0 and its earlier iteration, according to Stability AI, is the ability to create songs that sound like songs, complete with an intro, a progression, and an outro.

The company let me play a bit with Stable Audio to see how it works, and let’s just say there is still a long way to go before I can channel my inner Beyoncé. With the prompt “folk pop song with American vibes” (I meant Americana, by the way), Stable Audio generated a song that, in some parts, does sound like it belongs in my Mountain Vibes Listening Wednesday Morning Spotify playlist. But it also added what I guess are vocals? Another Verge reporter compared them to whale sounds. I’m more worried I accidentally summoned an entity into my home.


I theoretically could tweak the audio to bring it closer to my listening style, as new features in Stable Audio 2.0 let users customize a project by adjusting prompt strength (that is, how closely the output follows the prompt) and how much of any uploaded audio the model will modify. Users can also add sound effects like the roar of a crowd or keyboard taps.

Strange Gregorian whale noises aside, it’s not a surprise that AI-generated songs still feel soulless and weird. My colleague Wes Davis ruminated on this after listening to a song generated by Suno. Other companies, like Meta and Google, have also been dabbling in AI audio generation but have not released their models publicly while they gather feedback from developers and grapple with the soulless-sound problem.

Stability AI said in a press release that Stable Audio is trained on data from AudioSparx, which has a library of more than 800,000 audio files. Stability AI maintains that artists on AudioSparx were allowed to opt out of having their material used to train the model. Training on copyrighted audio was one of the reasons Stability AI’s former vice president for audio, Ed Newton-Rex, left the company shortly after the launch of Stable Audio. For this version, Stability AI says it partnered with Audible Magic to use its content recognition technology to track and block copyrighted material from entering the platform.

Stable Audio 2.0 is better than its previous version at making songs sound like songs, but it’s not quite there yet. If the model insists on adding some sort of vocals, maybe the next version will have more discernible language.


Emilia David
