Now ChatGPT has a body — startup puts OpenAI tech in a robot
Handing you apples and stacking your dishes is no challenge for the humanoid Figure 01
You may want to take a seat before reading this one. Maybe ask ChatGPT to hand you a glass of water while you’re at it.
A relatively new AI startup just put OpenAI's artificial intelligence into the body of a robot and the result is pretty much what you’d expect it to be (minus the chaos and destruction if you’re more of a glass of water half-empty kind of person).
This new tech is being developed by Figure, an AI robotics company worth $2.6 billion that’s partnered with OpenAI. Its latest innovation is Figure 01, a robot which the company demoed in an impressive video.
Images and speech are contextualized
Judging solely by the acting skills, it’s hard to tell who’s the real human, but we’re assuming that Figure 01 is the shiny-looking figure that’s doing all the work.
Text prompts are already becoming a thing of the past as Figure 01 is capable of having a real-time voice conversation with you — and it sounds exactly like conversations with the OpenAI ChatGPT Voice option in the app.
Images are captured from onboard cameras to provide the robot with a visual context so that when the human opposite it mentions he’s hungry, Figure 01 identifies an apple within reach and hands it over. We go from “Can I have something to eat?” to apple delivered successfully to the human hand in around 10 seconds.
Holding a complex conversation
We are now having full conversations with Figure 01, thanks to our partnership with OpenAI.
Our robot can:
- describe its visual experience
- plan future actions
- reflect on its memory
- explain its reasoning verbally
Technical deep-dive 🧵: pic.twitter.com/6QRzfkbxZY
March 13, 2024
Figure 01 can handle conversations as complex as the ones we have with ChatGPT. It can describe what it’s seeing, plan future actions, reflect on its memory, and explain its reasoning verbally.
Behind the scenes, the robot’s cameras are capturing images which are then contextualized. Microphones are picking up speech which is then transcribed to text and fed into a large multimodal model trained by OpenAI that’s capable of understanding both images and text.
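To make that flow a little more concrete, here is a rough Python sketch of one conversational turn: audio is transcribed, bundled with camera frames, and passed to a multimodal model along with the conversation history. It is only an illustration of the pipeline described above, not Figure's or OpenAI's actual code. Every name in it (`Observation`, `Conversation`, `run_turn`, and the stand-in `fake_transcribe` and `fake_model` functions) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


@dataclass
class Observation:
    """One turn of input: camera frames plus the transcribed speech."""
    images: Sequence[bytes]
    transcript: str


@dataclass
class Conversation:
    """Running history, so the model can refer back to earlier turns."""
    turns: List[Dict[str, object]] = field(default_factory=list)

    def add(self, role: str, text: str, images: Sequence[bytes] = ()) -> None:
        self.turns.append({"role": role, "text": text, "image_count": len(images)})


def run_turn(
    frames: Sequence[bytes],
    audio: bytes,
    transcribe: Callable[[bytes], str],
    multimodal_model: Callable[[Conversation, Observation], str],
    history: Conversation,
) -> str:
    """Process one spoken request and return the robot's text reply."""
    # 1. Speech picked up by the microphones is transcribed to text.
    text = transcribe(audio)
    # 2. The transcript and the camera images are bundled together.
    observation = Observation(images=frames, transcript=text)
    history.add("user", text, frames)
    # 3. A model that understands both images and text produces the reply.
    reply = multimodal_model(history, observation)
    history.add("robot", reply)
    return reply


# Hypothetical stand-ins so the sketch runs without any real hardware or API.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_model(history: Conversation, observation: Observation) -> str:
    return f"Saw {len(observation.images)} image(s); heard: {observation.transcript}"


if __name__ == "__main__":
    history = Conversation()
    print(run_turn([b"frame"], b"Can I have something to eat?",
                   fake_transcribe, fake_model, history))
```

In a real system the two stand-in functions would be replaced by an actual speech-to-text service and a multimodal model. Keeping the history around is what would let the robot "reflect on its memory" in later turns.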
So when Figure 01 was asked why it handed over the apple it promptly replied, “I gave you the apple because it’s the only edible item I could provide you with from the table.”
Humans have had an interesting history with apples. They led to quite some trouble in the Garden of Eden but then they inspired Isaac Newton to develop his gravitational theory.
Since Figure 01 can put things into context, maybe we should ask it which kind of scenario we should prepare for: whether we're meddling with forbidden fruit or on the cusp of a new era of science and technology.
Christoph Schwaiger is a journalist who mainly covers technology, science, and current affairs. His stories have appeared in Tom's Guide, New Scientist, Live Science, and other established publications. Always up for joining a good discussion, Christoph enjoys speaking at events or to other journalists and has appeared on LBC and Times Radio among other outlets. He believes in giving back to the community and has served on different consultative councils. He was also a National President for Junior Chamber International (JCI), a global organization founded in the USA. You can follow him on Twitter @cschwaigermt.