A song from any sound — Udio just got a major AI music update and I put it to the test with 7 prompts
Turn your squeaky door into an instrument
Leading artificial intelligence music platform Udio has just been given a major upgrade, adding the ability to use your own music or even household sounds as part of the song prompt.
The sound prompt feature is the latest in a series of upgrades in what feels like a tit-for-tat battle with main rival Suno, with each platform trying to one-up the other on features. The update follows Suno’s limited rollout of sound-to-music and also includes longer tracks.
With audio upload, you can use a clip as the starting sound of a song, or as part of the inspiration for how the track should sound. “Audio uploads greatly enrich your prompting vocabulary,” the AI startup said on X.
I put it to the test with a mixture of random sounds from around my home and even my own terrible guitar playing and singing. You can try it for yourself on the Udio website.
What can you do with audio upload in Udio?
Today we’re announcing a set of updates, starting with a new experimental feature for paid subscribers, audio uploads. You can upload an audio clip of your choice, and extend this clip either forward or backward by 32 seconds using up to 2 minutes of context. Audio uploads… pic.twitter.com/3X62kTJGij (June 5, 2024)
“You can use audio to set tempo and mood, and explore from there,” Udio says. “Maybe you’ve got a great intro but don’t know where to go next, or a full mix that’s missing the perfect bridge–in both cases, Udio can provide inspiration.”
I’ve tried it a few different ways and seem to get the best results when the clip sits in the middle of a song rather than at the start. You can upload up to two minutes, but shorter clips work better if you want the AI to do most of the work.
It is an experimental feature and is only available to paying subscribers. While the clip can be two minutes, it can only move the song forward or backward by 32 seconds.
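To get a feel for what the 32-second limit means in practice, here is a rough Python sketch of the arithmetic. It is not Udio's API: the function name `extensions_needed` is made up for illustration, and it assumes each extension adds exactly 32 seconds and that extensions can be chained one after another.

```python
import math

# Figures quoted in Udio's announcement.
EXTENSION_SECONDS = 32      # each extension adds up to 32 seconds
MAX_UPLOAD_SECONDS = 120    # uploaded clips can be up to two minutes


def extensions_needed(clip_seconds: float, target_seconds: float) -> int:
    """Estimate how many 32-second extensions grow a clip to a target length.

    Hypothetical helper: assumes every extension adds the full 32 seconds.
    """
    if clip_seconds <= 0 or clip_seconds > MAX_UPLOAD_SECONDS:
        raise ValueError("clip must be longer than 0s and no longer than 120s")
    # Nothing to add if the clip already meets the target length.
    remaining = max(0.0, target_seconds - clip_seconds)
    return math.ceil(remaining / EXTENSION_SECONDS)


if __name__ == "__main__":
    # A 10-second sample stretched to a three-minute track.
    print(extensions_needed(10, 180))
```

Under those assumptions, a 10-second sample would need six extensions to reach a three-minute track.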
Sometimes it seems to completely mask your original sound, especially if it’s an instrument, and other times it will keep it in full. It did this when I tried a toy drumstick noise, wrapping the song around the annoying sound rather than incorporating it into the beat.
7 tests to see how well it works
1. The toy drumstick
For Christmas last year my son got me a pair of toy drumsticks. They are air drums where you wave them and it makes a sound like hitting a snare or hi-hat. I recorded a 10-second sample of the sound on my phone and used that as a “middle track” moment.
The prompt, alongside the audio track as you can’t just give it a sound, was “Living a life to the full R&B.” I had it create an intro to run before the sound, then extended with a section after.
While it did work in the song — with the AI generating a hip-hop track with lo-fi vocals — for some reason it kept the entire drum solo in full as a standalone highlight. I tried again and it kept the solo but incorporated the sound into the beat for the second half.
2. The simple guitar riff
I then played a simple guitar riff on my slightly out-of-tune acoustic guitar with a capo on the second fret. This was just a couple of barre chords in a generic pattern. I included the text prompt “Misery loves company” with no genre or style, leaving that up to the AI.
This sounded awful — like someone ordered a classic early 90s grunge singer off Temu and had them perform an acoustic set. So I tried again, this time specifying the genre as alternative rock and it gave me 30 Seconds to Mars but from Wish.com.
3. The nursery rhyme-ish
This time I recorded myself saying a slightly twisted version of a classic nursery rhyme. I gave it the prompt “never again” with the genre UK Drill and asked it to create an intro so my nursery rhyme had to be incorporated into the track, not just put at the start.
That failed miserably, and not in an “it worked but was terrible” sense. It didn’t work at all. The AI generated random noise. So I tried again but removed the genre and had it just add a section to the sound. No matter what I tried it would not incorporate the sound.
Every attempt simply resulted in it being appended to a great AI-generated track, and it sounded like someone had taped over a recording using a Talkboy from the 90s.
4. The coffee machine
After the failure of my own voice to make music I decided to try something completely different, turning to a device in my home office I couldn’t manage without: the espresso machine. I have a simple Breville Bijou that lets me make a quick espresso or a latte when I like.
I gave it the full sound, which included the grinding and pouring, as well as the text prompt “All I need is coffee, hip hop”. I had it create a section before the sound, then used the extend feature to add another after the sound. I then tried again, making it the intro sound.
The first version was OK and made clever use of the sound, treating it as a beat and running a rap over the top, but overall it was terrible. That was as good as it got.
5. My terrible singing
Next up I picked up the guitar again and recorded a riff with a short vocal over the top. I uploaded this with “make me happy” as the text prompt and “drum n bass” as the genre.
My guitar is still slightly out of tune and I cropped the sound to focus just on the music and exclude the sounds of me picking up the guitar and putting it back down — I didn’t exclude those moments in the last guitar test.
This was the best yet. It not only mirrored the riff but seemed to somehow clone my voice, as the singer it generated sounded as bad as I did. It also created a fun little song.
6. The microwave goes ping
For this one I used the “food is ready” ping of my microwave. I left all settings on default, including where to place the sound, and gave it the prompt “Never ready” with no genre defined.
It worked so much better than I expected. The two tracks it created from my simple prompt were alternative rock and for some reason it added extra pings to the microwave, but it seemed to work. It would be great for the music video.
I tried again, this time asking it to put the pings in the middle of the song but leaving all other settings the same. It created a prog rock track and put the ping at the end, which didn’t really work.
7. The creepy old door
One of the doors in my house needs its hinges oiling, so I took advantage of this and recorded the sound in the hope of using it in a Halloween track. I gave Udio the prompt “Gothic rock; haunted mansion.”
This one impressed me and seemed to show off Udio’s ability to match the tone and tempo of the sound you use in the prompt. It created a perfect Halloween-inspired track, opening on the creepy old door (my house is 200 years old) then going into a pretty good track.
Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover. When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?