I used ChatGPT Advanced Voice to time travel — and I'm still shocked at how good it is
It speaks Latin
OpenAI’s GPT-4o Advanced Voice is one of the most powerful and potentially important artificial intelligence tools of the year. It lets you have a human-like natural conversation with an AI voice and even interrupt it when it's speaking too much.
Currently only available to a small number of ChatGPT Plus subscribers, this new method of interacting with technology is expected to be widely available this fall. The company also plans to launch a vision mode next year that lets the AI see the world through your camera.
What makes Advanced Voice different from the current ChatGPT Voice or even the newly launched Gemini Live is the fact it is speech-to-speech. This means it can natively understand what you say, how you say it and the emotional intonations behind your words.
It can also do accents and tell a great story, so I asked Advanced Voice to take me on a time travel adventure. It started with a trip to Ancient Egypt and spoke in the voice of a trader. Not only did it do a great job of the voices, but it is a fun storyteller.
Prompting the adventure with Advanced Voice
Using Advanced Voice isn't that different to any other artificial intelligence technology in that it starts with a prompt. Unlike talking to ChatGPT with text or generating an image with Midjourney, Advanced Voice is prompted by your voice.
At the most basic level, this is simply telling it what you want it to do, but it can also pick up on changes in your tone. If you ask it to explain the meaning of life while sounding slightly teary or upset, it will respond in a way that reflects how you sounded.
For this adventure I played it straight, simply starting by asking Advanced Voice: “Now, we're going to go through a story. Imagine you're a time traveler. When in history would you go?”
It suggested the World's Fair in Chicago in the 19th century. I asked it to take on the role of a time traveler but also to talk as people at the fair. After a brief sojourn to Chicago, I asked "Let's go somewhere else. Push the button and take me to a new location." We went to ancient Egypt.
Advanced Voice said: “Picture this: the grand pyramids are being built, and the Nile flows as the lifeblood of a thriving civilization. What are you most curious about in this time and place?”
This is where I asked it about the language, including speaking the words as accurately as possible based on what we know.
We then went to a market and finally on to Rome and a conversation between our Egyptian trader and a Roman citizen, one speaking Egyptian, the other Latin. I even had Advanced Voice use a Yoda voice for a small portion of the adventure and it gave it a good try.
Final thoughts
Advanced Voice is a brilliant storyteller, able to change emotion levels, reflect the intensity of different scenarios and even take on different accents and voices.
The problem I have with it is the limitations imposed by OpenAI. It 'could' generate sound effects to enhance a scene but it's been stopped from doing so. In theory it could adapt its voice even more than it does, but again it has been stopped.
The issue is an understandable one: safety. Asking the model to perform those more unpredictable tasks could lead to output that breaks OpenAI's safety guidelines and potentially push Advanced Voice into the realm of unsafe to release. It's just frustrating knowing those capabilities are slightly beyond reach.
Even without them, though, Advanced Voice is still the best interaction I've had with AI, allowing for real-time conversation, a natural flow where I can interrupt on a whim, and someone to talk to that responds as a human might to my tone and speed.
Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover. When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?