Stability AI Unveils its AI-Driven Audio Platform

Stability AI has introduced Stable Audio, an AI platform for text-to-audio generation. The company trained Stable Audio on audio data rather than images. The platform lets users create music or background audio for their projects.

Thrilled that demand for our Stable Audio launch today has been off the charts! But our servers are now at full capacity, so you may not be able to access the product. If you can't, we kindly ask that you check back in 24 hours to try again.

In the meantime, we're working hard…
— Stability AI (@StabilityAI) September 13, 2023

Audio diffusion models typically produce fixed-length output. This is a problem for music production, because songs vary widely in duration. In response, Stability AI adapted its model so that Stable Audio users can generate audio of varying lengths. To achieve this, the company trained on music data paired with text metadata that specifies where a song begins and ends.

Previously, a model trained on 30-second audio clips could generate only 30 seconds of audio, which limited its ability to produce coherent sections of songs. With these adjustments, Stable Audio users have more flexibility to set the duration of their songs.
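The idea of pairing each training clip with timing metadata can be sketched roughly as follows. This is a minimal illustration, not Stability AI's implementation: the article does not describe the metadata format, so the 30-second window, the field names `seconds_start` and `seconds_total`, and the trimming step at generation time are all assumptions made for the example.

```python
import random
from dataclasses import dataclass

WINDOW_SECONDS = 30.0  # assumed fixed clip length the model sees during training


@dataclass
class TimingMetadata:
    """Hypothetical timing fields attached to one training clip."""
    seconds_start: float  # where the clip begins inside the full track
    seconds_total: float  # length of the full track


def sample_training_window(track_seconds: float, rng: random.Random) -> TimingMetadata:
    """Pick a fixed-length window from a track and record where it sits."""
    # Tracks shorter than the window always start at 0.
    latest_start = max(0.0, track_seconds - WINDOW_SECONDS)
    start = rng.uniform(0.0, latest_start)
    return TimingMetadata(seconds_start=start, seconds_total=track_seconds)


def samples_to_keep(requested_seconds: float, sample_rate: int = 44_100) -> int:
    """At generation time, keep only the requested part of the fixed window."""
    clipped = min(max(requested_seconds, 0.0), WINDOW_SECONDS)
    return int(round(clipped * sample_rate))


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_training_window(200.0, rng))
    print(samples_to_keep(20.0))
```

In this sketch the model still works on a fixed-size window, and the timing fields tell it which part of a song that window represents. A requested duration shorter than the window is handled by trimming the output.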

The company trained Stable Audio on a dataset of more than 800,000 audio files, including songs, sound effects, and individual instrument tracks. It also used text metadata sourced from AudioSparx, a music licensing company. The dataset amounts to more than 19,500 hours of audio. Stability AI's collaboration with AudioSparx allowed it to use copyrighted material.

Prices

Stable Audio has three pricing tiers. The free version lets users generate audio clips of about 45 seconds and is limited to 20 tracks per month. The Professional plan costs $11.99 per month and allows 500 tracks of up to 90 seconds each. Lastly, there is an Enterprise subscription.

The Enterprise subscription lets companies tailor usage and pricing to their specific needs. Note that users of the free version cannot use the audio they create with Stable Audio for commercial purposes.
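To put the tier limits in perspective, the short calculation below works out the maximum amount of audio each fixed tier could produce in a month. It is a rough sketch using only the figures quoted above. It treats "about 45 seconds" as a 45-second cap and assumes the 500-track Professional allowance is monthly, like the free allowance. The dictionary keys are labels chosen for the example.

```python
# Track limits as quoted in the article; Enterprise is omitted because
# its usage is negotiated per customer.
TIERS = {
    "free": {"tracks_per_month": 20, "max_seconds_per_track": 45},
    "professional": {"tracks_per_month": 500, "max_seconds_per_track": 90},
}


def max_minutes_per_month(tier: str) -> float:
    """Upper bound on generated audio per month, in minutes."""
    plan = TIERS[tier]
    return plan["tracks_per_month"] * plan["max_seconds_per_track"] / 60


if __name__ == "__main__":
    for name in TIERS:
        minutes = max_minutes_per_month(name)
        print(f"{name}: up to {minutes:.0f} minutes of audio per month")
```

Under these assumptions the free version tops out at 15 minutes of audio a month, while the Professional plan allows up to 750 minutes (12.5 hours).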

The featured image is from decrypt.com