OpenAI to make GPT-4o Advanced Voice available by the end of the month to a select group of users

OpenAI CEO Sam Altman says the first users will start to get access to GPT-4o Advanced Voice in the next couple of weeks, but this will be a limited "alpha" rollout.

The company is testing the full capabilities of GPT-4o, a new type of Omni model released during its Spring Update in May. Unlike GPT-4, this natively multimodal model can understand speech directly without converting it into text.

This makes GPT-4o both faster and significantly more accurate when acting in the role of voice assistant, even allowing it to pick up on tone and vocal intonations during a conversation.

Users have been waiting patiently for access, but OpenAI says safety testing must be completed first. Some have briefly gained access, and there have been multiple demos of its capabilities, but most users won’t get it until later this year.

What is GPT-4o Advanced Voice?

“alpha starts later this month, GA will come a bit after” (July 18, 2024)

GPT-4o Advanced Voice is an entirely new type of voice assistant, similar to but larger than the recently unveiled French model Moshi, which argued with me over a story.

In demos of the model, we’ve seen GPT-4o Advanced Voice create custom character voices, generate sound effects while telling a story and even act as a live translator.

This native speech ability is a significant step in creating more natural AI assistants. In the future, it will also come with live vision abilities, allowing the AI to see what you see.

Other use cases for Advanced Voice include having it act as a very patient language teacher, able to correct you directly on pronunciation and help improve your accent.

“ChatGPT’s advanced Voice Mode can understand and respond with emotions and non-verbal cues, moving us closer to real-time, natural conversations with AI. Our mission is to bring these new experiences to you thoughtfully,” OpenAI said in a statement last month.

Why the delay in launching GPT-4o Advanced Voice?

OpenAI is one of the most cautious artificial intelligence labs, taking significant time to security test, verify and put guardrails in place for any new major model.

Altman has also called for regulation of frontier-style models like the upcoming GPT-5 or world models like Sora due to the risk they present to society. This caution has allowed other companies to begin to catch up with OpenAI, and GPT-4 is no longer the only top-tier model.

The company was concerned that GPT-4o Advanced Voice, without appropriate guardrails, could offer potentially harmful information or be used in unexpected ways. To address this, it is releasing the feature gradually, first to trusted users and then more widely over time.

“As part of our iterative deployment strategy, we'll start the alpha with a small group of users to gather feedback and expand based on what we learn,” a spokesperson explained.

“We are planning for all Plus users to have access in the fall. Exact timelines depend on meeting our high safety and reliability bar. We are also working on rolling out the new video and screen sharing capabilities we demoed separately, and will keep you posted on that timeline.”

Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover. When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?

Comment from the forums

333brando

Fake news... select group of people is probably just going to be dev team, while everyone in the public just complains that they didn't get chosen to receive nothing. The harsh reality and poor business decision is, they announced a product too far in advance, that was not even close to ready for public release. All of that in order to grasp for relevance because Google was overshadowing them with their announcements. Instead of getting people to hop aboard a hype train, the masses saw it for what it was-- a contrived piece of advertising to win over subscribers and maintain relevance. When did corporate programming become so overtly obvious in their deception?
