I tried the new ElevenLabs Video to Sound Effects demo — and it's pretty amazing
Bring life to AI video
ElevenLabs has done it again. The pioneer in high-quality AI-generated voice and SFX audio has just unveiled a new text-to-sound-effects API.
To celebrate the occasion, the company also released a very cool open-source demo called Video to Sound Effects to showcase what the tech can do. It’s available online and on GitHub, and it’s pretty awesome.
Just take your generated video, upload it to the ElevenLabs demo webpage, and wait while the platform analyzes the video and returns four different sound effect audio tracks to choose from.
Select the version you like and hit the download button to grab the video clip along with the new audio. Super simple. Starting from the upload of a 5-second clip, the whole process takes around 5 minutes.
This is a new area of AI known as video-to-audio (V2A). Google recently announced a research project promising similar technology, but that isn't yet available to try.
Putting ElevenLabs to the test
I tested it out using Luma Dream Machine (LDM) as my video generation tool. I tried five different video prompts with mixed results, but hey, it’s early days. Anyhoo, I eventually succeeded in getting a clip of a gorilla riding a Harley-Davidson motorbike, and uploaded it to the ElevenLabs demo page.
Within 20 seconds or so I had four audio samples to audition, chose one and started the download process. I have to say that despite some dodgy iterations the final result is actually pretty great. The video is hilarious, and the audio gives it a whole new dimension.
The tech works by sampling four frames at 1-second intervals from the uploaded video. These frames are sent to GPT-4o, which creates a custom text-to-sound-effects prompt.
The prompt is then sent back to the ElevenLabs API to create the final SFX. It’s crude, but surprisingly effective. The results will never win an Oscar, or indeed a Golden Reel Award, but as a quick and dirty way to give some life to a dull AI-generated video clip, it works well.
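The flow described above can be sketched in a few lines of Python. This is a rough illustration, not the demo's actual code: the function names are hypothetical, the assumption that sampling starts at the first frame is mine, and the three steps that would touch a video file, GPT-4o and the ElevenLabs API are passed in as plain callables so the sketch runs without any network access.

```python
from typing import Callable, List


def frame_timestamps(duration_s: float, n_frames: int = 4,
                     interval_s: float = 1.0) -> List[float]:
    """Timestamps (in seconds) of the frames to sample.

    Assumption: sampling starts at 0 s and skips any timestamp that
    falls past the end of the clip.
    """
    return [i * interval_s for i in range(n_frames)
            if i * interval_s < duration_s]


def video_to_sfx(duration_s: float,
                 grab_frame: Callable[[float], bytes],
                 describe_frames: Callable[[List[bytes]], str],
                 generate_sfx: Callable[[str], bytes],
                 n_variants: int = 4) -> List[bytes]:
    """Hypothetical pipeline: frames -> text prompt -> several SFX takes."""
    # 1. Sample a handful of still frames from the clip.
    frames = [grab_frame(t) for t in frame_timestamps(duration_s)]
    # 2. Ask a vision-capable model to turn the frames into an SFX prompt.
    prompt = describe_frames(frames)
    # 3. Request one audio track per variant from the sound effects API.
    return [generate_sfx(prompt) for _ in range(n_variants)]
```

For a 5-second clip, `frame_timestamps(5.0)` gives the four sample points 0, 1, 2 and 3 seconds, and `video_to_sfx` returns four tracks to audition, matching the four options the demo page offers.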
"We are excited to introduce the Text to Sound Effects API. To showcase it - we've built the first Video to Sounds Effects app. This app is available for free online and fully open-source." pic.twitter.com/8aalo8GCSo (June 17, 2024)
While the demo is clearly aimed at the general public, the new API is aimed at serious business use.
The company is not only targeting sound effects with the tech, but also on-demand samples for music production, and dynamic sound for video games.
To deploy the API, customers will need an ElevenLabs account with an API key. Each generation will cost 100 characters of the account's quota, or 25 characters per second when a set duration is requested.
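To get a feel for those numbers, here is a small Python sketch that estimates the character cost of a generation and prepares, without sending, a request to the API. The pricing figures come from the paragraph above; the rounding rule is my assumption, and so are the endpoint URL, the `xi-api-key` header and the JSON field names, which should all be checked against the official ElevenLabs API reference before use.

```python
import json
import math
import urllib.request
from typing import Optional

# Assumed endpoint; verify against the official API reference.
SFX_URL = "https://api.elevenlabs.io/v1/sound-generation"


def sfx_cost_characters(duration_seconds: Optional[float] = None) -> int:
    """Estimated cost of one generation, in characters of quota.

    100 characters when the duration is left to the service,
    25 characters per second when a duration is set.
    Rounding up to a whole character is an assumption.
    """
    if duration_seconds is None:
        return 100
    return math.ceil(25 * duration_seconds)


def build_sfx_request(api_key: str, prompt: str,
                      duration_seconds: Optional[float] = None
                      ) -> urllib.request.Request:
    """Build (but do not send) a text-to-sound-effects request."""
    body = {"text": prompt}
    if duration_seconds is not None:
        body["duration_seconds"] = duration_seconds  # assumed field name
    return urllib.request.Request(
        SFX_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"xi-api-key": api_key,  # assumed header name
                 "Content-Type": "application/json"},
        method="POST",
    )
```

On these figures, a 4-second effect with a set duration costs the same 100 characters as an automatic one, while a 5-second effect comes to 125. Passing the result of `build_sfx_request` to `urllib.request.urlopen` would actually send the request and spend quota.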
Nigel Powell is an author, columnist, and consultant with over 30 years of experience in the technology industry. He produced the weekly Don't Panic technology column in the Sunday Times newspaper for 16 years and is the author of the Sunday Times book of Computer Answers, published by Harper Collins. He has been a technology pundit on Sky Television's Global Village program and a regular contributor to BBC Radio Five's Men's Hour.
He has an Honours degree in law (LLB) and a Master's Degree in Business Administration (MBA), and his work has made him an expert in all things software, AI, security, privacy, mobile, and other tech innovations. Nigel currently lives in West London and enjoys spending time meditating and listening to music.