Google’s New AI Puts Us One Step Closer to Star Trek’s Universal Translator

Google is developing a new AI speech-to-speech translator technology that could imitate your voice in real time. Its name is Translatotron, and it may be key to achieving the seamless universal translation dream we see on Star Trek within our lifespan.

Credit: StarTrek.com/CBS Entertainment

(Image credit: StarTrek.com/CBS Entertainment)

The new tech skips the entire speech-to-text step of current translation technologies. Right now, you could speak into your microphone, get your speech recognized, and your phone would output the translated text to the screen with the option of reading it out loud in a generic synthetic voice.

The new Translatotron aims to do two things. First, eliminate the speech-to-text step and thus avoiding text-to-speech synthesis, going directly to a speech-to-speech model. And then, get rid of the generic voice and replace it with your own voice. While not perfect, the examples in the Google Research github page are pretty good (check out the “Predictions with voice transfer” column in the second “Conversational Spanish-to-English” section).

The paper — published on ArXiv by Google’s research scientists Ye Jia, Ron Weiss, Fadi Biadsy, Wolfgang Macherey, Melvin Johnson, Zhifeng Chen, and Yonghui Wu — describes that the team used a neural network to analyze the original speech spectrograms into target spectrograms in another language, reproducing the original voice.

The researchers acknowledge that the result is not perfect — yet. They are getting there, as this first research was to demonstrate the feasibility of this model: “The proposed model slightly underperforms a baseline cascade of a direct speech-to-text translation model and a text-to-speech synthesis model, demonstrating the feasibility of the approach on this very challenging task.”

The technology opens a path to a Star Trek-like future in which you would speak and, automagically, people will hear you actually speaking in their own language. Perhaps then humans will be able to understand each other, and get past one of the barriers that separate societies from one another. The end of Babel is nigh!

TOPICS
Jesus Diaz

Jesus Diaz founded the new Sploid for Gawker Media after seven years working at Gizmodo, where he helmed the lost-in-a-bar iPhone 4 story and wrote old angry man rants, among other things. He's a creative director, screenwriter, and producer at The Magic Sauce, and currently writes for Fast Company and Tom's Guide.

Latest in Google Gemini
A stock photo of a person on their phone looking at a spreadsheet while several graphs are displayed on the laptop in front of them.
Google Sheets just got an AI upgrade that analyzes your data and visualizes it
Gemini logo shown on a phone's screen
Google Gemini can now analyze and summarize documents for free — here's how
Gemini Live
Gemini Live major upgrade just revealed by Google
Gemini 2
Google Gemini 2.0 is now free for users — here’s how to access it now
Gemini 2
My browser tabs were getting out of hand so I let Gemini 2.0 takeover — here's how it went
Claude vs Gemini
I tested Gemini vs Claude with 7 prompts to find the best AI chatbot — here's the winner
Latest in News
iOS 19 logo on an iPhone
iOS 19 — all the rumors so far
NYTimes Connections
NYT Connections today hints and answers — Tuesday, March 11 (#639)
An image of a CAPTCHA
Hackers are using reCAPTCHA to trick users into infecting their own PCs with malware — how to stay safe
Gmail logo on iPhone
Gmail just got a huge AI upgrade that will save you a ton of time
Nina Oyama and Kate Box in Deadloch
One of my favorite shows on Prime Video has been totally overlooked — and it's got 100% on Rotten Tomatoes
Xbox handheld
Xbox handheld reportedly arriving this year, new PC-like console in 2027