The Reflection 70B model held huge promise for AI but now its creators are accused of fraud — here's what went wrong
Not what it claims
Here at Tom’s Guide our expert editors are committed to bringing you the best news, reviews and guides to help you stay informed and ahead of the curve!
You are now subscribed
Your newsletter sign-up was successful
Want to add more newsletters?
Daily (Mon-Sun)
Tom's Guide Daily
Sign up to get the latest updates on all of your favorite content! From cutting-edge tech news and the hottest streaming buzz to unbeatable deals on the best products and in-depth reviews, we’ve got you covered.
Weekly on Thursday
Tom's AI Guide
Be AI savvy with your weekly newsletter summing up all the biggest AI news you need to know. Plus, analysis from our AI editor and tips on how to use the latest AI tools!
Weekly on Friday
Tom's iGuide
Unlock the vast world of Apple news straight to your inbox. With coverage on everything from exciting product launches to essential software updates, this is your go-to source for the latest updates on all the best Apple content.
Weekly on Monday
Tom's Streaming Guide
Our weekly newsletter is expertly crafted to immerse you in the world of streaming. Stay updated on the latest releases and our top recommendations across your favorite streaming platforms.
Join the club
Get full access to premium articles, exclusive features and a growing list of member rewards.
The creators of Reflection 70B, a tuned-up version of Meta Llama 70B that was recently touted as the world’s top open-source AI model, have just opened up after being accused of fraud.
Based on independent tests run by Artificial Analysis, the model fails to deliver on the promises made by Matt Shumer, CEO of OthersideAI and HypeWrite, the company behind Reflection 70B. Shumer, who initially attributed the discrepancies to an issue with the model’s upload process, has since admitted that he may have gotten ahead of himself in the claims he had made.
But critics in the AI research community have gone as far as accusing Shumer of fraud, stating that the model is just a thin wrapper based on Anthropic’s Claude, rather than a tuned-up version of Meta Llama.
Discrepancies emerge after third-party evaluation
I got ahead of myself when I announced this project, and I am sorry. That was not my intention. I made a decision to ship this new approach based on the information that we had at the moment.I know that many of you are excited about the potential for this and are now skeptical.…September 10, 2024
Developed by New York startup HyperWrite AI, Reflection 70B was touted as "the world's top open-source model" by Matt Shumer, the company’s CEO.
Yet on September 7, a day after Shumer’s announcement on X, Artificial Analysis reported that their evaluation of Reflection 70B yielded results significantly lower than Shumer's claims. Shumer attributed these to an upload error affecting the model's weights, which caused a discrepancy between Shumer’s private API and the weights uploaded to Hugging Face’s model repository.
However, further analysis by the AI community on platforms like Reddit and Github suggested that Reflection 70B’s performance mirrors closer to Meta Llama 3 rather than Llama 3.1, as claimed by Shumer. Suspicions were raised further when it was found that Shumer had an undisclosed vested interest in Glaive, the platform he claimed was used to generate the model's synthetic training data.
Some went on to suggest that Reflection 70B was merely a "wrapper" built on top of Anthropic's proprietary AI model, Claude 3. On September 8, X user Shin Megami Boson publicly accused Matt Shumer of “fraud in the AI research community.”
Get instant access to breaking news, the hottest reviews, great deals and helpful tips.
HypeWrite breaks silence following fraud accusations
I want to address the confusion and valid criticisms that this has caused in the community. I am currently investigating what happened that led to this and will share a transparent summary as soon as possible. There are two areas I’d like to address, which I am investigating:-… https://t.co/NSjx6oqPRoSeptember 10, 2024
After initially going silent as the controversy erupted, Shumer issued a public response through X on September 10, acknowledging the skepticism around the model’s performance. He claimed a team was working to understand what went wrong and promised transparency once they had the facts.
However, Shumer did not provide a clear explanation for the performance discrepancies. Sahil Chaudhary, founder of Glaive, the platform Shumer said was used to train Reflection 70B, also admitted uncertainty about the model's capabilities and that the touted benchmark scores had not been reproducible.
Critics have remained unsatisfied with Shumer's response so far. "Shumer's explanations and apologies have failed to provide a satisfactory explanation for the discrepancies," reported analytics firm GlobalVillageSpace. Yuchen Jin, co-founder of Hyperbolic Labs, expressed disappointment in the lack of transparency and called for more thorough explanations from Shumer.
More from Tom's Guide
- I finally saw a live demo of ChatGPT-4o Voice — if anything it's underhyped
- Microsoft Build 2024: The biggest AI announcements and what they mean for you
- Gemini Live — what features are available now and what is coming soon

Ritoban Mukherjee is a freelance journalist from West Bengal, India whose work on cloud storage, web hosting, and a range of other topics has been published on Tom's Guide, TechRadar, Creative Bloq, IT Pro, Gizmodo, Medium, and Mental Floss.










