OpenAI knocks Gemini off the top of chatbot leaderboard with its new model
Robot wars.
OpenAI's ChatGPT and Google Gemini have been duking it out for your chatbot prompts for months, but the competition is really starting to heat up.
While Claude briefly took the top spot on the AI benchmarking leaderboard LMSys Chatbot Arena earlier this year, Gemini had since been reigning supreme.
Now, though, a new version of ChatGPT-4o (20240808) has reclaimed the lead from its rivals with a score of 1314 — 17 points ahead of Gemini-1.5-Pro-Exp.
This comes just a day after Google mentioned its lead on the arena board during its Made by Google keynote.
As lmsys.org put it on X, "New ChatGPT-4o demonstrates notable improvement in technical domains, particularly in Coding (30+ points over GPT-4o-20240513), as well as in Instruction-following and Hard Prompts."
We called it
As the team posted on August 14, 2024: "Exciting Update from Chatbot Arena! The latest @OpenAI ChatGPT-4o (20240808) API has been tested under 'anonymous-chatbot' for the past week with over 11,000 community votes. OpenAI has now successfully re-claimed the #1 position, surpassing Google's Gemini-1.5-Pro-Exp with an…"
We spotted recently that OpenAI had rolled out a new version of GPT-4o in ChatGPT, and a different but similar model arrived for developers yesterday, too — the same day the Chatbot Arena results were revealed.
In our testing, we found it to be much snappier than prior versions, even building an entire iOS app in an hour using the latest version of the model.
That, paired with improvements to the Mac app, means it's been a bigger week than usual for ChatGPT users and OpenAI itself.
Still, with new models and revamped ones arriving all the time, there's every chance we'll see a reshuffle at the top of the pile in the coming months — or even weeks.
We have yet to see the launch of Google Ultra 1.5 or Claude Opus 1.5, and xAI's Grok 2 has already made its first appearance in the top ten.
A freelance writer from Essex, UK, Lloyd Coombes began writing for Tom's Guide in 2024 having worked on TechRadar, iMore, Live Science and more. A specialist in consumer tech, Lloyd is particularly knowledgeable on Apple products ever since he got his first iPod Mini. Aside from writing about the latest gadgets for Future, he's also a blogger and the Editor in Chief of GGRecon.com. On the rare occasion he’s not writing, you’ll find him spending time with his son, or working hard at the gym. You can find him on Twitter @lloydcoombes.
Araki
It doesn't really matter; I'll be using gpt-4o and gpt-4o-mini via API even if they were below all these Geminis and Claudes in benchmarks, simply because OpenAI's models are the least censored and don't have these biases the others for some reason absolutely must push on the user. What's the point of the unmoderated endpoints from Google and Anthropic if their models will break through the instruction and write three paragraphs, wasting paid tokens and computational resources, telling me how absolutely unsafe it is to roleplay as an anime catgirl?
At least until the next open-source model that will rule the scene for one day before getting overshadowed by a new proprietary model.
Iamhe02
Gemini is absurdly, comically censored. Try asking it, "Who was the first president of the United States?" Then ask yourself if you'd be willing to pay $20/month for a model that can't handle such a basic, uncontroversial query.
Araki
Iamhe02 said: "Gemini is absurdly, comically censored. Try asking it, 'Who was the first president of the United States?' Then, ask yourself if you'd be willing to pay $20/month for a model that can't handle such a basic, uncontroversial query."
It's so Google to write "Avoid creating or reinforcing unfair bias" in their Google AI Principles and then fine-tune their model on numerous unfair biases they personally believe in. Or I guess I should say "fair biases"? Such hypocrisy.
Most people are far removed from the technology behind all that "AI stuff" and actually believe the things LLMs say without double-checking, so using a product they developed to force their own beliefs onto the general public should straight-up be illegal.