ElevenLabs drops new conversational AI — it’s as natural as chatting to a human
I made one to improve my math skills
Voice is the future of human-computer interaction. I've said this several times recently, and AI voice company ElevenLabs has a new product that further highlights the power of conversation in getting things done.
ElevenLabs' Conversational AI system is a voice bot set up to feel like you're making a phone call, and holding a conversation with it is just like calling a human.
It is fully customizable, letting you select, design or even clone the voice it uses. You can also add your own knowledge base. For example, if you're making a math tutor you could include access to SAT prep guides.
The most useful aspect is being able to set the underlying brain, or language model. You can pick between any OpenAI, Google or Anthropic model or even include your own custom model if you're running a company.
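To make those options concrete, here is a minimal sketch of the choices an agent bundles together. The `AgentConfig` class, its field names and the example values are all hypothetical and used for illustration only; this is not ElevenLabs' actual API or settings format.

```python
from dataclasses import dataclass, field


@dataclass
class AgentConfig:
    """Hypothetical container for the choices described above."""
    name: str
    voice: str                 # a selected, designed or cloned voice
    llm: str                   # the underlying language model
    system_prompt: str         # how the agent should behave
    knowledge_base: list[str] = field(default_factory=list)  # reference documents


# Illustrative values only; the file name and voice label are made up.
math_tutor = AgentConfig(
    name="Math tutor",
    voice="cloned-voice-placeholder",
    llm="gemini-1.5-flash",
    system_prompt="You are a patient math tutor.",
    knowledge_base=["sat_prep_guide.pdf"],
)
print(math_tutor)
```

Swapping the `llm` value is the code equivalent of changing the agent's "brain" while keeping the same voice and knowledge base.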
How does Conversational AI work
Conversational AI is here. Build AI agents that can speak in minutes with low latency, full configurability, and seamless scalability. pic.twitter.com/JqBlwVczdX (December 3, 2024)
Unlike ChatGPT Advanced Voice, this is not native speech-to-speech. It works like Gemini Live or Meta AI voice: you speak, it turns your speech into text and sends that to the AI. The AI responds in text, and ElevenLabs voices the reply using its existing voice models. This happens so fast it may as well be speech-to-speech.
To make this work, ElevenLabs engineers had to create a new custom speech-to-text model that could transcribe the user's words fast enough that the delay wasn't noticeable. They then had to ensure it all worked seamlessly together.
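The three-stage flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not ElevenLabs' implementation: `run_turn` and the three `fake_*` stages are hypothetical stand-ins, and each stage just passes text around so the example runs without any network calls.

```python
from typing import Callable


def run_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    generate_reply: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run one conversational turn through a cascaded voice pipeline."""
    user_text = transcribe(audio)            # 1. speech-to-text
    reply_text = generate_reply(user_text)   # 2. text-only language model
    return synthesize(reply_text)            # 3. text-to-speech


# Stand-in stages (hypothetical): real ones would call speech and LLM services.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


reply_audio = run_turn(b"what is 7 times 8", fake_transcribe, fake_llm, fake_tts)
print(reply_audio)
```

Because the language model only ever sees and produces text, it can be swapped for another model without touching the speech stages, which is what makes the choice of "brain" configurable.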
With Conversational AI, ElevenLabs is directly competing with OpenAI's Realtime API offering. These are model systems designed to make it easier for a company or organization to offer voice-based interaction with products. This could be in a call center fielding phone calls or something less obvious like learning products.
One example use case could be in a children's toy, where the model is trained to offer support and feedback in an age-appropriate way.
Creating a voice assistant
Anyone with an ElevenLabs account can create a conversational agent. It comes with four default templates that can be fully customized.
One is a support agent called Eric designed to resolve issues, another is Matilda the math tutor and a third is a travel guide called George with information on most places around the world. The fourth is a video game wizard with a mysterious voice.
You can also create agents from scratch. I tried it with a life coach that has access to commonly used coaching tools such as habit tracking and goal setting. It uses Gemini 1.5 Flash for speed and price reasons.
Making a call to the agent costs 500 credits per minute during development. The starter plan gives you 30,000 credits for $4 per month.
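As a quick back-of-the-envelope check on those figures, the snippet below works out how much call time the starter plan covers. It assumes every credit is spent on calls at the development rate quoted above; the variable names are just for illustration.

```python
CREDITS_PER_MINUTE = 500   # development call rate quoted above
PLAN_CREDITS = 30_000      # starter plan allowance
PLAN_PRICE_USD = 4.00      # starter plan monthly price

# Assumes all credits go to calls and none to other ElevenLabs features.
minutes_included = PLAN_CREDITS // CREDITS_PER_MINUTE
cost_per_minute = PLAN_PRICE_USD / minutes_included

print(f"{minutes_included} minutes, ${cost_per_minute:.3f} per minute")
# prints "60 minutes, $0.067 per minute"
```

In other words, the starter plan covers roughly an hour of test calls a month, at just under seven cents a minute.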
Overall, it is a simple process to set up. There is a lot of flexibility in how you build an agent, and your agents will appear in the sidebar of your ElevenLabs account. You can also import Twilio phone numbers and hook them up to your voice assistant.
For fun, I created a customer support agent named Ryan that uses a clone of my own voice. I'm going to see if my Dad notices when I give it a phone number and tell him it's my new work number and to call if he needs tech help.
Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover. When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?