ChatGPT finally has competition — Google Bard with Gemini just matched it with a huge upgrade

(Image credit: Shutterstock)

Google Bard with Gemini just equaled ChatGPT’s performance on a popular chatbot arena, coming second on the leaderboard just behind GPT-4-Turbo, OpenAI’s most advanced model.

Powered by a newly updated version of the new Gemini Pro artificial intelligence model, Bard has seen a marked increase in performance since the model’s release in December. My own test of Bard vs the free version of ChatGPT saw Bard come out on top.

The Large Model Systems Organization (LMSYS) Chatbot Arena pits the leading AI models against one another with humans judging and rating the output. It includes both closed-source models like GPT-4 or Gemini Pro, as well as open-source AI like Meta’s Llama 2.

This is the first time Bard has beaten the base version of GPT-4, which powers the premium version of both ChatGPT and Microsoft Copilot. Jeff Dean, Chief Scientist at Google DeepMind wrote on X that the recent success was down to a new version of Gemini Pro called “scale”.

How does the chatbot arena work?

LMSys Chatbot Arena — (Image credit: LMSys)

Like any battleground, the Chatbot Arena puts two models in the ring head-to-head, the difference is you have no idea of the identity of the competitors.

You write a prompt, it gets sent to two anonymous models, and after the response is shown you have to determine which is the better response of the two.

Only the underlying system knows which model won each round, and the collective human decisions are used to create the leaderboard. So far more than 200,000 votes have been cast.

Swipe to scroll horizontally

LMSys Chatboat Arena leaderboard
Rank	Model	Organization	License
1	GPT-4 Turbo	OpenAI	Proprietary
2	Bard (Gemini Pro)	Google	Proprietary
3	GPT-4-0314	OpenAI	Proprietary
4	GPT-4-0613	OpenAI	Proprietary
5	Mistral Medium	Mistral	Proprietary
6	Claude-1	Anthropic	Proprietary
7	Claude-2.0	Anthropic	Proprietary
8	Mixtral-8x7b	Mistral	Apache 2.0
9	Gemini Pro (Dev API)	Google	Proprietary
10	Claude 2.1	Anthropic	Proprietary

How does Bard compare with the competition?

Gemini Pro on its own, as an API and not part of Bard with live internet access comes in at number 9 and an earlier version of Gemini Pro is at number 12 in the current leaderboard.

While Bard with Gemini Pro has come in strong at number two, three of the top five are all versions of OpenAI's GPT-4 model with GPT-4-Turbo taking the top spot.

Nine of the top ten models are proprietary, with French startup Mistral’s Mixtra-8x7b the only open-source large language model in the upper region of the table.

Anthropic's Claude and Mistral’s Mistral Medium, a proprietary version of the Mixtral model, are the only non-Google or OpenAI models in the top ten.

What is new with Google Bard?

Bard, powered by the Gemini Pro-scale model, debuts at the #2 position on the independent lmsys leaderboard. 🔥Give it a try at https://t.co/m9D7JYUfls. Bard is much better & has many more capabilities since its debut in March, thanks to everyone on the Bard/Gemini teams! https://t.co/rOPTbOE4v8January 26, 2024

According to Jeff Dean Bard is now “powered by the Gemini Pro-scale model, an update on the base version of Gemini Pro and this change allowed it to enter at number two. This is still based on the mid-tier version of the Gemini family of LLMs.

Google is said to be releasing Gemini Ultra, its most advanced and natively multimodal model at some point this year.

This will power a new premium version of its chatbot called Bard Advanced and, likely, this will significantly outperform event GPT-4-Turbo.

Dean wrote on X: “The race is heating up like never before! Super excited to see what's next for Bard + Gemini Ultra release.”

More from Tom's Guide

Back to MacBook Air

Apple

Asus

Lenovo

Intel Core M3

Intel Pentium

8GB RAM

16GB RAM

128GB

256GB

512GB

1TB

13.3-inch

13.6-inch

Grey

Silver

EMMC

SSD

Showing 10 of 34 deals

Filters☰

Apple MacBook Air M2 2022

(13.6-inch 256GB)

$889.95

View

Asus Zenbook S 13 OLED

(13.3-inch 512GB)

$1,524.99

$1,189.99

View

Lenovo IdeaPad Duet 3

$369.99

View

Asus ROG Zephyrus G14 2023

$1,599.99

View

Asus Zenbook S 13 OLED

(OLED)

$1,399.99

View

Apple MacBook Pro 14-inch M3 (2023)

(1TB Intel Core M3)

Our Review

☆☆☆☆☆

$2,399

$1,998.98

View

Apple MacBook Air M2 2022

$1,499

View

Apple MacBook Pro 14-inch M3 (2023)

(1TB SSD)

Our Review

☆☆☆☆☆

$1,744.95

View

Apple MacBook Air M2 2022

$999

$889.95

View

Lenovo IdeaPad Duet 3

(128GB 8GB RAM)

$379.99

View

Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover. When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?

1 Comment Comment from the forums

stevenstarr

I tested Google's Bard with Gemini Pro and requested it to compose a Python tutorial for the Django framework. Unfortunately, it failed to generate anything beyond a few brief paragraphs. Moreover, the information it presented was either incomplete or outright incorrect. On the other hand, ChatGPT 3.5 faces no such challenges; it effortlessly tackles this task, producing over 20 pages of accurate content with minimal errors, typically confined to sporadic and often unnecessary steps.
Reply

How does the chatbot arena work?

Sign up to get the BEST of Tom's Guide direct to your inbox.

How does Bard compare with the competition?

What is new with Google Bard?

More from Tom's Guide