StabilityAI drops Stable Audio 2.0 — here’s everything that’s new


StabilityAI has unveiled the second iteration of its artificial intelligence music generation tool, offering longer tracks, audio-to-audio support, and a greater commitment to protecting the copyright of creators.

Stable Audio 2.0 allows users to create three-minute tracks in 44.1 kHz stereo by entering a natural language prompt such as “A beautiful piano arpeggio grows to a full beautiful orchestral piece”, “Lo-fi funk” or “drum solo”. The AI-generated tracks are structured compositions with an intro, development and outro, and they can include stereo sound effects.

Stable Audio 2.0 is also no longer solely a text-to-audio tool. Users can now upload an audio file to the platform and have it turned into “fully produced samples”. For example, mimicking a drum sound with your voice would prompt the app to create an audio clip of a drum playing.

Taking copyright seriously

When using the new audio-to-audio feature, users must refrain from uploading copyrighted material under Stability AI’s terms and conditions. The company uses content recognition technology to enforce this policy and prevent copyright infringement.

Like Stable Audio 1.0, the second model is trained on AudioSparx’s library of 800,000 audio files, which covers music, sound effects and single-instrument stems, along with text-based metadata. AudioSparx musicians unhappy with the idea of their works being used for AI model training had the opportunity to opt out.


These reinforced copyright protection and creator opt-out policies follow the departure of the company’s former VP of audio, Ed Newton-Rex. He announced his resignation in November 2023 in an X post that heavily criticized the company’s approach to upholding creators’ rights.

“I’ve resigned from my role leading the Audio team at Stability AI, because I don’t agree with the company’s opinion that training generative AI models on copyrighted works is ‘fair use’,” he wrote.

He concluded his post by urging creators to voice their concerns to ensure tech companies “realise that exploiting creators can’t be the long-term solution in generative AI.”

Under the hood

In addition to longer tracks and audio-to-audio support, Stable Audio 2.0 has a reworked architecture that enables the “generation of full tracks with coherent structures.” Adapting every component of the system has resulted in “improved performance over long time scale,” the company claimed.

The tool features a new type of autoencoder that compresses raw audio waveforms into much shorter representations. Meanwhile, a diffusion transformer, similar to the one that powers Stable Diffusion 3, can handle longer data sequences.

“The combination of these two elements results in a model capable of recognizing and reproducing the large-scale structures that are essential for high-quality musical compositions,” wrote Stability AI in a blog post.
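To get a rough sense of why compression matters here, the short Python sketch below counts the raw samples in a three-minute, 44.1 kHz stereo track and shows how much shorter the sequence becomes once an autoencoder shrinks it. The downsampling factor of 2048 is an assumption chosen purely for illustration, and the function names are hypothetical; neither comes from Stability AI’s announcement.

```python
import math

SAMPLE_RATE_HZ = 44_100   # 44.1 kHz, as quoted in the article
CHANNELS = 2              # stereo
DURATION_S = 3 * 60       # a three-minute track

# ASSUMPTION: illustrative downsampling factor only, not a figure
# taken from the article or from Stability AI's announcement.
ASSUMED_DOWNSAMPLING = 2048


def raw_frames(duration_s: int, sample_rate_hz: int) -> int:
    """Number of audio frames (one per time step, per channel)."""
    return duration_s * sample_rate_hz


def latent_steps(frames: int, downsampling: int) -> int:
    """Sequence length after time-axis compression, rounded up."""
    return math.ceil(frames / downsampling)


frames = raw_frames(DURATION_S, SAMPLE_RATE_HZ)
values = frames * CHANNELS
steps = latent_steps(frames, ASSUMED_DOWNSAMPLING)

print(f"Raw frames per channel: {frames:,}")
print(f"Raw sample values (stereo): {values:,}")
print(f"Latent steps at an assumed {ASSUMED_DOWNSAMPLING}x compression: {steps:,}")
```

Under that assumed factor, the transformer would work on a sequence of a few thousand steps rather than the roughly eight million frames in the raw waveform, which is the kind of reduction that makes modelling a full track’s structure practical.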

The tool is free to use and available immediately.


Nicholas Fearn is a freelance technology journalist and copywriter from the Welsh valleys. His work has appeared in publications such as the FT, the Independent, the Daily Telegraph, The Next Web, T3, Android Central, Computer Weekly, and many others. He also happens to be a diehard Mariah Carey fan!
