AI’s capability to generate realistic text and images is well known, but it can also produce lifelike sounds. In recent years, AI-driven technology for creating music, voice, and sound effects has made significant strides. Though this may not yet pose a threat to professionals in the industry, it offers a range of practical solutions for those needing background sounds or voiceovers for their projects.
So here’s my roundup of the five tools that have impressed me most with their ability to create realistic-sounding voices, audio effects or even catchy pop songs. And if none of them quite fits your needs, there’s a roundup of the best of the rest, too.
Stable Audio 2.0
Stable Audio 2.0, developed by Stability AI, one of the original developers of the Stable Diffusion image generation model, features text-to-audio as well as audio-to-audio generation. This means it can create a song or tune from an uploaded sample as well as from a natural language prompt. Tracks can be up to three minutes long. Importantly, it was trained entirely on licensed data from the AudioSparx music library, meaning original creators are compensated for their work. The generative model is based on a latent diffusion algorithm, which works in a similar way to diffusion-based image generation, and tracks created on the platform can be freely used for commercial purposes.
Mubert
Mubert is effectively an all-in-one generative AI-powered music production studio. It can generate tracks up to 25 minutes in length from a single natural language prompt, with users given a choice of genres, instruments, moods and styles of music. An extension and plugin system means it can be integrated with popular industry-standard video editing tools like After Effects and Premiere, and the Mubert Studio platform lets you work on music collaboratively with others. There are various licensing packages available that allow you to use the tunes you create in commercial projects; however, uploading to music streaming services like Spotify is not currently allowed.
ElevenLabs
ElevenLabs is a sophisticated text-to-voice generator, created by former Google and Palantir engineers, that produces spoken-word audio. Simply type in the text you want to hear, select one of the pre-set voices and hear your words brought to life. What makes it particularly impressive is the amount of emotional intonation that can be applied to the output, creating very natural, human-sounding dialogue. In fact, the technology is so good that it has been adopted by publisher HarperCollins to create audiobooks in different languages.
Synthesia
Synthesia is a great all-round generative AI tool that I also mentioned in my roundup of my favorite video genAIs. But it also works very well for creating voices, so it makes this list, too. With a library of over 130 voices to choose from, it can quickly translate your audio into numerous languages – you can even manually adjust the pronunciation of individual words if you don’t like the way they sound by default. This makes it great for creating voiceover tracks for any kind of video or even automating the creation of podcasts, trailers, audiobooks or any other kind of spoken content you might need.
Suno
Suno is a lot of fun! It creates songs about anything you want, complete with lyrics, from a simple text prompt. You can tell it to create the song in whatever genre you want and either supply the lyrics yourself or let the generative algorithms write them for you. The singing voices sound very natural and human. It works on a credit system: free-tier users can create songs up to 1 minute and 20 seconds in length, and can extend them with additional credits by purchasing a subscription to one of the premium tiers. Users of the paid-for service are granted permission to monetize the content they create or use it for commercial purposes.
Other Great Generative AI Music And Sound Tools
There are lots of these out there! Most of them can be tried for free, so dive in and see if there’s something that suits your needs.
AudioCraft is an open-source sound generation model created by Meta. It’s not currently available as a web service, and installation and some technical know-how are required to get it running. A demo of some of its features is available online, though.
Generative AI-powered songwriting assistant that allows users to pay per finished track.
Great for those wanting to use AI to develop complex and emotional music pieces that sound like they were created by human composers.
Compose short songs from text prompts, with lyrics generated by GPT-3.
Generate background music for online content (or any other type of music) in multiple styles and edit with simple AI tools.
Create songs in seconds with a simple interface and a strong user community.
Convert blog posts into audio experiences.
Text-to-voice featuring your favorite (or least favorite) celebrities!
AI voice platform with a number of tools, including text-to-speech, AI voice generation, and AI cover songs.
Create a clone of your own voice and hear it sing any song.
Create personalized audio stories.
This is a fully-featured cloud mixing and recording platform with AI functionality baked into the mastering process.
AI tool that lets you extract elements such as vocals or instrument tracks from existing audio and video.
Music platform for creating AI-generated, royalty-free tracks with AI-assisted recommendations.
AI voice studio with realistic and customizable text-to-speech.
Generative music creation from Google, powered by the search giant’s MusicLM model.
Podcastle
Podcasting tool with a number of genAI functions, including text-to-voice and noise removal.
Generate unique tunes with the help of AI at the click of a button.
Text-to-speech tool for creating natural-sounding synthetic voices.
Create custom music tracks in many different styles and moods for royalty-free use.
AI music generation with various licensing options available for using your tracks commercially.