Meet Fugatto — an impressive new AI sound model from Nvidia


Graphics and AI giant Nvidia has announced a new AI model called Fugatto (short for Foundational Generative Audio Transformer Opus 1). Developed by an international team of researchers, it is being billed as "the world’s most flexible sound machine," taking on ElevenLabs and AI music maker Suno in one hit.

With this model, we’re about to witness a completely new paradigm in how sound and audio are manipulated and transformed by AI. It goes way beyond converting text to speech or producing music from text prompts and delivers some genuinely innovative features we haven’t seen before.

Fugatto isn't currently available to try, as it exists only as a research paper. It is likely to be made available to one or more Nvidia partners in the future, and at that point we will start to see some significant changes in how sound is developed.

How does Nvidia Fugatto work?

Key to Nvidia Fugatto is its ability to exhibit emergent capabilities, which the team is calling ComposableART. This means it can do things it was not trained to do, by combining different capabilities together in new ways.

The authors of the launch research paper describe how the model can produce a cello that shouts with anger, or a saxophone that barks. It may sound silly but some of the demonstrations seen on the project’s homepage are very impressive.

Examples include instantly converting speech into different accents and emotional intensities, and seamlessly adding instruments to, or removing them from, an existing music performance.

We have seen some of this from other models such as OpenAI's Advanced Voice, ElevenLabs' SFX model or Google's MusicFX experiment, but not combined in a single model.

What can Nvidia Fugatto be used for?


One of the most striking examples the team puts forward is the on-the-fly generation of complex sound effects, some of which are completely new or wacky.

Video game developers and those in the movie industry will either be salivating or sweating at the news that almost any kind of soundscape will soon be AI-generated at the touch of a button.

The power of all this technology is delivered via a model that features 2.5 billion parameters and was trained on a huge suite of Nvidia computer processors, as you might expect.

As with many of these early research demonstrations, it’s likely to be a while before we see a fully fledged product released onto the market. Creating a four-second audio clip of a thunderstorm or a mechanical monster is one thing; making it usable in the real world is another.

However, there’s no question that the technology behind this new model shows that an important bridge has been crossed in the ability of the machine to master another art form. It may be the first time we’ve seen generative AI power of this type, but it’s certainly not going to be the last.

Nigel Powell
Tech Journalist

Nigel Powell is an author, columnist, and consultant with over 30 years of experience in the technology industry. He produced the weekly Don't Panic technology column in the Sunday Times newspaper for 16 years and is the author of the Sunday Times book of Computer Answers, published by Harper Collins. He has been a technology pundit on Sky Television's Global Village program and a regular contributor to BBC Radio Five's Men's Hour.

He has an Honours degree in law (LLB) and a Master's Degree in Business Administration (MBA), and his work has made him an expert in all things software, AI, security, privacy, mobile, and other tech innovations. Nigel currently lives in West London and enjoys spending time meditating and listening to music.
