Reflection is a new AI model that can run on a good laptop and still beat GPT-4o in tests

Adobe Firefly AI image of a Llama looking in a mirror
(Image credit: Adobe Firefly AI image/Future)

There's a new heavyweight contender in the world of open-source AI models. Reflection 70B, developed by startup HyperWrite, is making waves with its "reflection" mechanism, which aims to fix a core problem with large language models: hallucination.

In early benchmarks, this souped-up version of Meta's Llama 3.1-70B Instruct architecture is already outperforming OpenAI's GPT-4o.

Reflection 70B introduces a novel approach to enhancing the reasoning capabilities and accuracy of language models. By assessing its own outputs before delivering a final response, Reflection 70B can detect errors in its reasoning and correct them on the fly. The result is a powerful open-source model that's pushing the boundaries of what's possible with AI today.

What is Reflection 70B?

Reflection 70B is an open-source language model developed by Matt Shumer and his team at HyperWrite. Its full name is Reflection Llama 70B, as it is based on the Meta Llama architecture.

The model's name is derived from two things: its size of 70 billion parameters, and its ability to "reflect" on its own outputs before providing a final answer. This reflection process is designed to enhance the model's reasoning capabilities and improve the overall accuracy of its responses.

Reflection 70B stands out from other generative AI models due to its error identification and correction capabilities. The model's reflection mechanism allows it to assess the accuracy of its generated text before delivering outputs to the user. This is achieved through a technique called reflection tuning, which enables the model to detect errors in its own reasoning and correct them in real time.
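As a rough illustration of the draft, check, and revise idea described above, here is a minimal Python sketch. It is not HyperWrite's implementation: the article attributes the behavior to a training technique, whereas this sketch uses an explicit loop, and every function name in it is a hypothetical stand-in working on a toy addition task.

```python
from typing import Callable, Optional, Tuple

Question = Tuple[int, int]  # toy task: add two integers


def answer_with_reflection(
    question: Question,
    draft: Callable[[Question], int],
    critique: Callable[[Question, int], Optional[str]],
    revise: Callable[[Question, int, str], int],
    max_rounds: int = 2,
) -> int:
    """Draft an answer, check it, and revise while the check reports a problem."""
    answer = draft(question)
    for _ in range(max_rounds):
        problem = critique(question, answer)
        if problem is None:  # no error found, so stop reflecting
            break
        answer = revise(question, answer, problem)
    return answer


# Hypothetical stand-ins for what a model would do internally.
def careless_draft(q: Question) -> int:
    return q[0] + q[1] - 10  # deliberately wrong first attempt


def check_sum(q: Question, answer: int) -> Optional[str]:
    # Returns a description of the problem, or None if the answer holds up.
    return None if answer == q[0] + q[1] else "sum does not match the operands"


def redo_sum(q: Question, answer: int, problem: str) -> int:
    return q[0] + q[1]


result = answer_with_reflection((17, 25), careless_draft, check_sum, redo_sum)
print(result)  # 42
```

The point of the loop is only that the first draft is never returned unchecked; how a real model performs the check is outside what this sketch can show.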

Moreover, Reflection 70B has demonstrated strong performance across various benchmarks, including MMLU and HumanEval, consistently outperforming models from Meta's Llama series and competing closely with top commercial models like GPT-4o. The model recorded 99.2% accuracy on the GSM8K benchmark, which evaluates math and logic skills.

About HyperWrite, the creators of Reflection

HyperWrite, the company behind Reflection 70B, is an AI writing start-up led by Matt Shumer. The company offers a Chrome extension that uses AI to automate and accelerate the writing process, providing services such as autocompletion, text generation, and sentence rephrasing.

HyperWrite has raised a total of $5.4 million in funding from 10 investors, including Madrona Venture Group and Active Capital. The company's focus on developing powerful AI writing tools has positioned it at the forefront of the AI industry.

Going forward, the future looks bright for Reflection 70B and HyperWrite. Shumer has already revealed plans for an even larger model, Reflection 405B, set to launch in the near future. This more powerful model is expected to push the boundaries of open-source AI even further.

Additionally, HyperWrite is working on integrating the Reflection 70B model into its primary AI writing assistant product. This integration will provide users with access to the model's advanced capabilities, enhancing their writing experience and productivity.

Want to take Reflection 70B for a spin right now? Following an announcement on X, the AI model is available as a free online demo on Railway. If you have a decent gaming laptop, you can also download the model for offline use through Hugging Face.
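For a rough sense of what offline use demands, the size of the weights alone for a 70-billion-parameter model can be worked out with simple arithmetic. The sketch below assumes three common precisions (16, 8 and 4 bits per parameter) and ignores the extra memory needed at run time for activations and context; the function name is illustrative.

```python
PARAMS = 70e9  # 70 billion parameters, per the model's name


def weight_memory_gb(params: float, bits_per_param: int) -> float:
    """Memory needed for the weights alone, in gigabytes (10**9 bytes)."""
    return params * bits_per_param / 8 / 1e9


for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gb(PARAMS, bits):.0f} GB")
# 16-bit: 140 GB
# 8-bit: 70 GB
# 4-bit: 35 GB
```

Even at 4 bits per parameter, the weights come to about 35 GB before any runtime overhead.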

Ritoban Mukherjee

Ritoban Mukherjee is a freelance journalist from West Bengal, India whose work on cloud storage, web hosting, and a range of other topics has been published on Tom's Guide, TechRadar, Creative Bloq, IT Pro, Gizmodo, Medium, and Mental Floss.

  • stwil
    Model appears to be a fake, wrapper on claude 😞
  • dozoy
    Whoever thinks you could run this on a laptop needs their head examined 🤣

    Even a 4bit quant wouldn't get you close