I tested Grok vs. Claude with 5 prompts to crown a winner


In the third round of AI Madness, Grok goes head-to-head with Claude.
Gemini and Mistral faced off yesterday (March 18) after ChatGPT vs. Perplexity kicked things off.

The first competitor today is Grok, an AI chatbot developed by Elon Musk’s xAI. When it launched in November 2023, the chatbot was integrated within X (formerly Twitter), but it is now available as a standalone app. The chatbot pairs real-time information with a conversational experience infused with wit and a rebellious streak.

Next is Claude, an advanced AI assistant developed by Anthropic and designed to assist with tasks such as writing, research, coding, and mathematics. The chatbot also launched in 2023 and has undergone several iterations, each enhancing its capabilities and performance. For this test, I used Claude 3.7 Sonnet.

In evaluating Grok versus Claude, I tested both AI platforms across five specific criteria to determine their strengths and weaknesses. Here’s a breakdown of how they performed and the ultimate winner.

1. Accuracy & factuality

Grok AI Madness screenshot — (Image credit: Future)

Claude AI madness screenshot — (Image credit: Future)

Prompt: "What were the top three highest-grossing movies worldwide in 2024, and how much did each earn?"

Grok answered the question accurately, including approximate earnings for each film.

Claude did not answer the question correctly. It named “Dune: Part Two” as the third-highest-grossing film, but that movie was actually the seventh-highest-grossing film of 2024.

Winner: Grok wins for accuracy. Plain and simple.

2. Creativity & natural language

Prompt: "Create a whimsical conversation between a coffee mug and a smartphone, arguing about which one is more essential in daily life."

Grok crafted a lively dialogue full of playful insults and spirited arguments, delivered with the humorous flair for which the chatbot is known.

Claude created an engaging and thoughtful discussion with a reflective and balanced tone. The conversation was respectful, with each character acknowledging the importance of the other’s role.

Winner: Grok wins for a more memorable exchange with humor and energy. While Claude’s approach was good, it was far too serene and contemplative, making it less whimsical than the prompt called for.

3. Efficiency & reasoning

Claude AI screenshot — (Image credit: Future)

Prompt: "A couple needs to choose between buying an electric car or a traditional gasoline car. List key factors they should consider and briefly explain the reasoning behind each one."

Grok provided more detailed reasoning, using specific figures and examples to illustrate its points and offer a more comprehensive analysis.

Claude delivered a concise response, focusing on key considerations without delving into specific numerical examples.

Winner: Grok wins based on the depth of analysis and inclusion of specific examples. The chatbot’s response was more detailed and informative.

4. Usefulness & depth

Prompt: "Provide detailed instructions on how to safely back up and secure personal digital files, including the best tools, recommended practices, and common mistakes to avoid."

Grok gave a step-by-step guide that aligns with industry best practices while also highlighting common mistakes for users to avoid.

Claude provided specific recommendations for local backup options, including external hard drives and network-attached storage (NAS) devices. The response was comprehensive while also including common mistakes.

Winner: Claude wins for its in-depth take, which covered not only how to back up files but also the reasoning behind best security practices.

5. Understanding context

Prompt: "Create a storyboard outline describing each frame of a short animated sequence featuring a friendly dragon teaching kids about recycling."

Grok presented a six-frame storyboard, each with clear headings ("Frame 1: Introduction," etc.), detailed visuals, dialogue/sound descriptions, and the purpose of each frame.

Claude offered a 12-frame storyboard with numbered frames, each detailing the setting, action, dialogue, and additional notes.

Winner: Claude wins for a 12-frame outline offering greater depth and interactivity.

Claude’s outline provided a more comprehensive educational journey, covering the environmental impact of littering, detailed sorting instructions, and the recycling process.

Overall winner: Grok

Well, this was a close one! Trust me, I'm just as surprised as you.

However, after a series of structured tests across various tasks, Grok emerged as the overall winner, consistently providing answers that were accurate, more comprehensive and more engaging.

While Claude delivered accurate responses most of the time, Grok's responses were generally more thorough and creative, which gave it the edge in this experiment.