Google’s New AI Puts Us One Step Closer to Star Trek’s Universal Translator

Google is developing a new AI speech-to-speech translation technology that could imitate your voice in real time. Its name is Translatotron, and it may be key to achieving the seamless universal translation we see on Star Trek within our lifetimes.

Credit: StarTrek.com/CBS Entertainment

The new tech skips the speech-to-text step that current translation technologies rely on. Right now, you speak into your microphone, your speech is recognized and converted to text, and your phone outputs the translated text to the screen, with the option of reading it out loud in a generic synthetic voice.

The new Translatotron aims to do two things. First, eliminate the speech-to-text step, and with it the need for text-to-speech synthesis, by going directly to a speech-to-speech model. Second, get rid of the generic voice and replace it with your own. While not perfect, the examples on the Google Research GitHub page are pretty good (check out the “Predictions with voice transfer” column in the second “Conversational Spanish-to-English” section).
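To make the difference between the two approaches concrete, here is a minimal Python sketch. It is not Google's code: the function names and the toy stand-ins for the recognizer, translator, synthesizer, and speech-to-speech model are all hypothetical, and a spectrogram is simply assumed to be a list of frames. It only shows where text appears in the pipeline and where it does not.

```python
from typing import Callable, List

# Assumption: a spectrogram is modeled as a list of frames, each a list of floats.
Spectrogram = List[List[float]]


def cascade_translate(
    source: Spectrogram,
    recognize: Callable[[Spectrogram], str],
    translate_text: Callable[[str], str],
    synthesize: Callable[[str], Spectrogram],
) -> Spectrogram:
    """Today's pipeline: speech -> text -> translated text -> speech."""
    text = recognize(source)           # speech-to-text
    translated = translate_text(text)  # text-to-text translation
    # Only text reaches the synthesizer, so the speaker's voice is not passed along.
    return synthesize(translated)      # text-to-speech


def direct_translate(
    source: Spectrogram,
    model: Callable[[Spectrogram], Spectrogram],
) -> Spectrogram:
    """Direct approach: a single model maps speech to speech, with no text step."""
    return model(source)


# Hypothetical toy stand-ins so the sketch runs end to end.
def toy_recognize(spec: Spectrogram) -> str:
    return "hola"


def toy_translate_text(text: str) -> str:
    return {"hola": "hello"}.get(text, text)


def toy_synthesize(text: str) -> Spectrogram:
    # One fake "frame" per character.
    return [[float(ord(ch))] for ch in text]


def toy_speech_model(spec: Spectrogram) -> Spectrogram:
    # Pretend translation: returns a spectrogram of the same shape.
    return [[value * 2.0 for value in frame] for frame in spec]


if __name__ == "__main__":
    source = [[1.0, 2.0], [3.0, 4.0]]
    cascaded = cascade_translate(source, toy_recognize, toy_translate_text, toy_synthesize)
    direct = direct_translate(source, toy_speech_model)
    print(len(cascaded), len(direct))
```

In the cascade, everything about how you sound is dropped the moment your speech becomes text. In the direct version, the model sees the original audio the whole way through, which is what makes carrying your voice over possible in principle.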

The paper, published on arXiv by Google research scientists Ye Jia, Ron Weiss, Fadi Biadsy, Wolfgang Macherey, Melvin Johnson, Zhifeng Chen, and Yonghui Wu, describes how the team used a neural network to convert spectrograms of the original speech directly into target spectrograms in another language, reproducing the original voice.

The researchers acknowledge that the result is not perfect yet. This first study was meant to demonstrate that the model is feasible: “The proposed model slightly underperforms a baseline cascade of a direct speech-to-text translation model and a text-to-speech synthesis model, demonstrating the feasibility of the approach on this very challenging task.”

The technology opens a path to a Star Trek-like future in which you will speak and, automagically, people will hear you actually speaking in their own language. Perhaps then humans will be able to understand each other and get past one of the barriers that separate societies from one another. The end of Babel is nigh!

Jesus Diaz

Jesus Diaz founded the new Sploid for Gawker Media after seven years working at Gizmodo, where he helmed the lost-in-a-bar iPhone 4 story and wrote old angry man rants, among other things. He's a creative director, screenwriter, and producer at The Magic Sauce, and currently writes for Fast Company and Tom's Guide.
