I just tested the new Grok-3 with 5 prompts — here’s what I like and don’t like about this chatbot

Grok
(Image credit: Shutterstock)

Grok-3 is the latest advanced AI chatbot developed by xAI, Elon Musk's AI company. Launched today, Grok-3 boasts over ten times the computational power of its predecessor, Grok-2, and introduces enhanced reasoning capabilities designed to tackle complex tasks by breaking them into smaller components and self-verifying solutions before responding.

In early testing, Grok-3 has demonstrated superior performance compared to models like OpenAI's GPT-4o, Google's Gemini, and DeepSeek's V3. It offers two distinct reasoning modes: "Think," which displays Grok's thought process during problem-solving, and "Big Brain," intended for more computationally intensive tasks. Additionally, xAI has introduced Deep Search, a next-generation AI search engine, similar to the deep search agents of Perplexity, Gemini, and ChatGPT. A synthesized voice feature for Grok is rumored to be coming in the near future.

Access to Grok-3 is available through the X Premium Plus subscription, which recently increased in price to $40 per month, with an option for an advanced SuperGrok plan. Despite its aim of maximum truth-seeking, previous versions of Grok faced criticism for misinformation and offensive outputs. xAI plans to open-source Grok-2 in the near future.

I asked Perplexity to help me come up with 5 prompts that would test Grok-3. One of the reasons I test chatbots is to determine how reliable they are. Interestingly enough, after noticing that Grok-3 did not always cite sources, I had to tweak the prompts to ensure I would be able to do my own research to fact-check the chatbot.

1. Advanced reasoning

Grok-3 screenshot

(Image credit: Future)

Prompt: "Explain the concept of quantum entanglement and its implications for information transfer."

Grok-3’s response effectively introduces quantum entanglement, describing how particles become interconnected such that the state of one directly influences the state of another, regardless of distance. The AI utilizes relatable analogies, such as comparing entangled particles to linked objects, which helps demystify complex quantum phenomena for anyone who may not have a deep understanding of the topic.

However, Grok-3 does not reference authoritative sources to support its claims. If it incorporated citations from reputable scientific literature, users could feel more confident in the credibility and reliability of the information presented.

2. Deep Research

Grok-3 screenshot

(Image credit: Future)

Prompt: "Provide a summary of the latest research on renewable energy sources published in the past month."

Grok-3 quickly pulled from a variety of sources, and its response addresses multiple facets of renewable energy research, including solar and wind energy advancements, energy storage solutions, green hydrogen production, bioenergy developments, and grid integration strategies. This breadth shows an understanding of the diverse areas within the renewable energy sector.

Additionally, the mention of integrating AI and machine learning for better grid management indicates that the chatbot has an understanding of the interdisciplinary approaches that may enhance renewable energy systems. However, while the response provides a general overview, it lacks references to specific studies, publications, or data from the past month (mid-January to mid-February 2025). Incorporating concrete examples or findings would strengthen the credibility and relevance of the summary.

While I can see the sources, it would be nice if Grok-3 pointed them out, specifically indicating where the information can be found. Plus, the AI’s use of phrases such as "research has likely continued" and "studies have probably built on efforts" suggests assumptions rather than definitive information, which undercuts the authority of the response.

3. Big Brain mode

Grok-3 screenshot

(Image credit: Future)

Prompt: "Analyze the economic impacts of implementing universal basic income in developed countries."

Grok-3’s response presents both positive and negative impacts of universal basic income (UBI), providing a nuanced perspective that acknowledges the complexity of the issue. This time, the AI referenced specific studies and pilot programs, which help ground the response in real-world examples and enhance the chatbot’s credibility.

Yet, the response uses words such as “might” and “could,” which may undermine the chatbot’s authority on the subject. The response also does not fully address possible counterarguments, and the analysis primarily focuses on immediate impacts rather than examining long-term economic consequences.

4. Image generation with Aurora

Grok-3 screenshot

(Image credit: Future)

Prompt: "Generate a photorealistic image of a futuristic cityscape at sunset."

The photorealistic quality of the images is extremely high, with realistic lighting, reflections, and atmospheric effects that make them visually compelling and immersive. The futuristic architecture and color palette combine for a visually appealing scene, while the various images provide diverse perspectives. From street-level shots to riverfront views, I appreciated the variety of angles and viewpoints.

Yet, while the images maintain a futuristic aesthetic, the styles vary: some have a hyper-modern look, while others appear almost present-day with minimal enhancements. Although the buildings look futuristic, adding innovative elements such as flying vehicles would make this cityscape far more futuristic.

5. Multimodal input processing

Grok-3 screenshot

(Image credit: Future)

Prompt: "Analyze global temperature changes over the past century and summarize the key trends."

Grok-3’s response correctly outlines the overall global temperature increase (~1.1–1.2°C) since the early 20th century, which aligns with findings from NOAA, NASA, and the IPCC (I had to do the manual legwork to check this). It also identifies two key warming phases (1910–1940 and post-1970), capturing historical variations in warming trends. The mention of Arctic amplification and differences in warming rates between land and ocean is scientifically well-supported.

The AI acknowledges that land regions have warmed faster than the global ocean average. However, it does not cite specific datasets or reports, which would improve credibility (I had to research myself to determine the accuracy). Including a reference to a widely accepted temperature dataset (e.g., HadCRUT, GISTEMP) would strengthen the argument. As with other responses, phrases like "typically observed" and "often cited" introduce a level of uncertainty.

Final thoughts

Grok-3 demonstrates strength in handling analytical and explanatory prompts across a range of complex topics, including climate science, economics, AI, and physics. While the responses are generally well-structured and informative, there are areas where the chatbot could use improvement. For example, if users choose to use Grok-3 for academic or professional research purposes, the chatbot still needs to be fact-checked. I had to do that during this experiment because Grok did not always cite sources.

Although it often references major institutions such as NASA, it does not link directly to a specific report or database. Additionally, while some scientific uncertainty is valid, the chatbot often used tentative phrasing that weakened my confidence in its claims. Because of that scientific uncertainty and lack of specific data, I was left doubting the response.

Finally, while Grok-3 mostly interpreted my image prompt correctly, it did not fully incorporate the requested elements, which made me wonder how often it might do this with other prompts.

Overall, Grok-3 is a highly capable AI that excels at structuring information clearly and does a nice job of engaging users with appropriate dialogue. Is it good? Yes. “Scary good”? Not so fast, Elon.

Amanda Caswell
AI Writer
