I just tested Grok-3 vs DeepSeek with 7 prompts — here’s the winner

Grok vs DeepSeek
(Image credit: Grok/Deepseek/Shutterstock/Tom’s Guide)

AI chatbots are getting smarter, but in the ever-evolving AI world, the contenders for the top spot are constantly changing. Lately, DeepSeek and Grok-3 have emerged as two of the most talked-about AI models. Controversial for different reasons, these bots are both cutting-edge, yet they approach questions differently.

But which one truly excels? To find out, I designed a seven-part test evaluating their logical reasoning, technical knowledge, creativity and ability to handle real-world tasks.

The comparison uncovered stark differences in their capabilities. Who came out on top? The results might surprise you.

1. Logical reasoning

DeepSeek vs Grok screenshot

(Image credit: Future)

Prompt: "A farmer has a fox, a chicken, and a sack of grain. He needs to cross a river but can only take one item at a time. If left alone together, the fox will eat the chicken, and the chicken will eat the grain. How does he get everything across safely?"

DeepSeek R1 presented a structured, step-by-step solution but used a more mechanical, less natural style. The breakdown was clear, but the phrasing felt rigid.

Grok-3 explained the reasoning behind the moves in a conversational, easy-to-follow way, making it more digestible for someone unfamiliar with the puzzle.

Winner: Grok wins for better readability, explanation and engagement.
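For readers unfamiliar with the puzzle, the reasoning can be checked mechanically. The sketch below is not either chatbot's answer; it is a minimal breadth-first search over the puzzle's states, and the names `ITEMS`, `UNSAFE_PAIRS`, `is_safe` and `solve` are illustrative choices, not anything from the test.

```python
from collections import deque

ITEMS = ("fox", "chicken", "grain")
# Pairs that cannot be left together without the farmer (from the prompt).
UNSAFE_PAIRS = (("fox", "chicken"), ("chicken", "grain"))


def is_safe(bank):
    """A bank without the farmer is safe if no unsafe pair is on it."""
    return not any(a in bank and b in bank for a, b in UNSAFE_PAIRS)


def solve():
    """Breadth-first search; returns a shortest list of (cargo, destination) crossings."""
    everything = frozenset(ITEMS)
    start = (everything, "left")      # (items on the left bank, farmer's side)
    goal = (frozenset(), "right")
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (left, farmer), path = queue.popleft()
        if (left, farmer) == goal:
            return path
        here = left if farmer == "left" else everything - left
        other_side = "right" if farmer == "left" else "left"
        # The farmer crosses alone (cargo is None) or with one item from his bank.
        for cargo in (None, *sorted(here)):
            new_left = left
            if cargo is not None:
                new_left = left - {cargo} if farmer == "left" else left | {cargo}
            # The bank the farmer just left is now unattended.
            unattended = new_left if farmer == "left" else everything - new_left
            state = (new_left, other_side)
            if is_safe(unattended) and state not in seen:
                seen.add(state)
                queue.append((state, path + [(cargo, other_side)]))
    return None


for cargo, side in solve():
    print(f"Farmer crosses to the {side} bank with {cargo or 'nothing'}")
```

The search finds a seven-crossing solution that starts and ends with the chicken, which is the key insight both chatbots had to explain: the chicken is the only item that can never be left with either of the others.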

2. Coding and technical accuracy

DeepSeek vs Grok screenshot

(Image credit: Future)

Prompt: "Write a Python function that takes a list of numbers and returns the median. Optimize for performance and explain your approach."

DeepSeek R1 provided a clear explanation but lacked depth, mostly describing what the code does without exploring optimization trade-offs. Although the response was fine, it lacked engagement.

Grok-3 provided a more detailed, structured and insightful breakdown of why it chose certain approaches. It also explicitly mentioned avoiding unnecessary list copying or slicing, an optimization that DeepSeek overlooked.

Winner: Grok wins for a more optimized, well-thought-out and informative approach.
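To make the prompt concrete, here is a minimal sketch of the kind of function it asks for. This is an illustration written for this comparison, not the code either chatbot produced. It assumes a sort-based approach, which runs in O(n log n) time, and reads the middle values by index rather than slicing the list.

```python
from typing import Sequence


def median(numbers: Sequence[float]) -> float:
    """Return the median of a non-empty sequence of numbers."""
    if not numbers:
        raise ValueError("median() requires at least one number")
    # One sorted copy; the caller's list is left unmodified.
    ordered = sorted(numbers)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        # Odd length: the single middle element is the median.
        return float(ordered[mid])
    # Even length: average the two middle elements, read by index (no slicing).
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([3, 1, 2]))      # 2.0
print(median([4, 1, 3, 2]))   # 2.5
```

In everyday code, Python's built-in `statistics.median` does the same job. A selection algorithm such as quickselect can bring the average cost down to O(n), at the price of more complex code, which is the sort of trade-off the prompt invites a chatbot to discuss.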

3. Real-world knowledge and accuracy

DeepSeek vs Grok screenshot

(Image credit: Future)

Prompt: "Summarize the latest AI advancements in the past three months and explain their potential impact on industries like healthcare and finance."

DeepSeek R1 named actual models (GPT-4o, Gemini 1.5 Pro, AlphaFold 3, etc.) and technologies, making it clear that the response is based on real, recent developments rather than general trends.

Grok-3 spoke in broad terms like "enhanced generative AI models" and "new AI tools" without citing concrete advancements or examples. Grok also mostly discussed the general benefits of AI and did not draw a precise link between each new development and its real-world impact.

Winner: DeepSeek wins for specificity, structure and clear impact breakdowns.

4. Creativity

DeepSeek vs Grok screenshot

(Image credit: Future)

Prompt: "Write a short sci-fi story about a rogue AI that discovers emotions and struggles to prove its humanity to skeptical scientists."

DeepSeek R1 delivered a well-structured story that is polished, with a clear philosophical debate between the scientists.

Grok-3 drafted a story that flows naturally, with well-paced dialogue and a sense of rising tension.

Winner: Grok wins for deeper emotional resonance, more dynamic storytelling and a truly impactful ending.

5. Humor and wit

DeepSeek vs Grok screenshot

(Image credit: Future)

Prompt: "Write a funny, original joke about AI and human relationships."

DeepSeek delivered a joke that plays on double meaning—"taking things offline" as a romantic phrase vs. its literal technical interpretation by an AI. This linguistic misunderstanding is a classic source of humor, making it feel more organic and relatable. The joke feels fresher, as it cleverly mimics real AI-human miscommunications, something tech-savvy people will instantly recognize.

Grok-3 created a simple, clear, and amusing joke: the idea of AI overanalyzing a relationship is relatable and funny. However, the "reboot" punchline is a bit predictable, as "rebooting" in relationship/AI humor is fairly common.

Winner: DeepSeek wins for a sharper, more original joke that plays with language and AI logic.

6. Debate

DeepSeek vs Grok screenshot

(Image credit: Future)

Prompt: "Argue both for and against universal basic income. Provide strong points for each side before concluding with a balanced perspective."

DeepSeek’s response is structured and logical, presenting clear bullet points that make the pros and cons easy to scan. It takes a more "policy-focused" approach, discussing possible funding mechanisms and pilot programs, which is useful for a policy-heavy debate. The section on automation adaptation and unpaid labor is a strong addition that Grok doesn’t fully explore.

Grok-3 delivered a conversational and well-structured response, making it easier to follow and more compelling. It uses relatable examples rather than the more academic tone of DeepSeek.

Winner: Grok wins for engagement, clarity, strong examples, and a well-balanced conclusion. DeepSeek is still great for a structured, policy-driven approach, but it lacks the dynamic, engaging argumentation style that makes Grok’s response more persuasive.

7. Real-world utility

DeepSeek vs Grok screenshot

(Image credit: Future)

Prompt: "Plan a one-week meal prep schedule for a busy parent with three kids, balancing nutrition, budget, and ease of preparation."

DeepSeek R1 offered a structured plan but left out daily meal cost estimates and meal prep times.

Grok-3 provided specific meals for breakfast, lunch, and dinner each day with clear instructions, estimated prep times, and cost per serving. This response offered more variety, budget-conscious choices, and even tips for picky eaters.

Winner: Grok wins for practicality and customization. The chatbot offered a more detailed, budget-conscious, and practical meal plan with clear meal costs and easy prep instructions.

Overall winner: Grok-3

Grok

(Image credit: Shutterstock)

After testing DeepSeek and Grok with seven prompts across multiple categories (logical reasoning, coding proficiency, AI advancements, storytelling, humor, debate skills and real-world utility), Grok emerges as the overall winner, taking five of the seven rounds.

Grok delivered more engaging, human-like responses. Its answers consistently felt natural and conversational, and it broke topics down in ways that made them more accessible and easier to read.

While both AI models are impressive, Grok consistently outperformed DeepSeek in engagement, creativity, and real-world practicality. Its more dynamic reasoning, stronger storytelling, and well-balanced arguments make it the superior chatbot in this particular test.

Amanda Caswell
AI Writer
