What is GPT2? Mysterious new AI model could be a preview of OpenAI’s next-gen behemoth
It appeared out of nowhere
An impressive new artificial intelligence model appeared seemingly out of nowhere on the popular chatbot arena LMSys. This has led to speculation that it could be a preview of a new model, such as GPT-5, from a company like OpenAI.
Named GPT2, it was added to the arena with no documentation or other information. People encountering it have described it as more capable than GPT-4 and very good at reasoning.
Very little is known about GPT2 beyond its capabilities, with some users running it against common benchmarks and finding it comes out near the top. This increased speculation that it might be a preview of a new OpenAI model.
OpenAI CEO Sam Altman added fuel to the speculation, posting on X: “I do have a soft spot for gpt2.” The post initially read GPT-2 but was edited to match the style of the new AI model's name.
So what is GPT2?
🧵 megathread of speculations on "gpt2-chatbot": tuned for agentic capabilities? some of my thoughts, some from reddit, some from other tweeters. my early impression is 👇 (April 29, 2024)
The new model appears as gpt2-chatbot in the LMSys arena. This is not to be confused with GPT-2 (with a hyphen), one of OpenAI's earliest models, although some have speculated it is a fine-tuned version of that small model.
People trying it have said that in some of the responses it performs better than GPT-4, the current leader on the LMSys leaderboard and the most powerful model from OpenAI. This includes tests run across multiple AI models.
Stanford researcher and leading AI expert Andrew Gao noted that it feels around the same level as GPT-4, not necessarily any better, but with a different voice than the OpenAI model.
The differences from GPT-4 in the way it responds don't necessarily mean it is a new model. Gao said "I feel like you could fine-tune GPT-4" to achieve similar results.
So who made GPT2?
i do have a soft spot for gpt2 (Sam Altman on X, April 30, 2024)
It isn’t clear who made GPT2 or where it came from. It could be a new startup coming out of stealth, a group of researchers testing a fine-tuned version of an existing model, or, as speculation seems to suggest, OpenAI playing guerrilla marketing games.
Whether it is an OpenAI model isn’t clear, but several clues point in that direction. These include OpenAI's increasing use of teaser-type tactics and some behaviors seen in GPT2.
Gao wrote: “Someone reported that the model has the same weaknesses to certain special tokens as other OpenAI models and it appears to be trained with the OpenAI family of tokenizers.” So even if it isn’t an OpenAI model, GPT-4 was likely involved in the creation of training data.
In testing, GPT2 has been able to break with learned conventions and create ASCII art, and it is particularly good at coding.
One leading theory is that this is Elon Musk testing version two of his X-powered Grok language model as a way to make people see it is more than just a slightly unhinged chatbot.
I’m sure we will find out the origin eventually, but it is fun to speculate and nice to know AI development continues apace, with innovations surprising even the more jaded experts.
Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover. When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?