I tested Grok-3 with 5 prompts — here’s what I like and don’t like about this chatbot

(Image credit: Shutterstock)

Grok-3 is the latest advanced AI chatbot developed by xAI, Elon Musk's AI company. Launched today, Grok-3 boasts over ten times the computational power of its predecessor, Grok-2, and introduces enhanced reasoning capabilities designed to tackle complex tasks by breaking them into smaller components and self-verifying solutions before responding.

In early testing, Grok-3 has demonstrated superior performance compared to models like OpenAI's GPT-4o, Google's Gemini, and DeepSeek's V3. It offers two distinct reasoning modes: "Think," which displays Grok's thought process during problem-solving, and "Big Brain," intended for more computationally intensive tasks. Additionally, xAI has introduced Deep Search, a next-generation AI search engine, similar to the deep search agents of Perplexity, Gemini, and ChatGPT. A synthesized voice feature for Grok is rumored to be coming in the near future.

Access to Grok-3's functionalities is available through the X Premium Plus subscription, which recently increased in price ($40 per month), with an option for an advanced SuperGrok plan. Despite aiming for maximized truth-seeking capabilities, previous versions faced criticism for misinformation and offensive outputs. xAI plans to open-source Grok-2 in the near future.

I asked Perplexity to help me come up with 5 prompts that would test Grok-3. One of the reasons I test chatbots is to determine how reliable they are. Interestingly enough, after noticing that Grok-3 did not always cite sources, I had to tweak the prompts to ensure I would be able to do my own research to fact-check the chatbot.

1. Advanced reasoning

Grok-3 screenshot — (Image credit: Future)

Prompt: "Explain the concept of quantum entanglement and its implications for information transfer."

Grok-3’s response effectively introduces quantum entanglement, describing how particles become interconnected such that the state of one directly influences the state of another, regardless of distance. The AI utilizes relatable analogies, such as comparing entangled particles to linked objects, which helps demystify complex quantum phenomena for anyone who may not have a deep understanding of the topic.

However, Grok-3 does not reference authoritative sources to support its claims. If it incorporated citations from reputable scientific literature, users could feel more confident in the credibility and reliability of the information presented.

2. Deep Research

Prompt: "Provide a summary of the latest research on renewable energy sources published in the past month."

Grok-3 quickly pulled from a variety of sources and the response addresses multiple facets of renewable energy research, including solar and wind energy advancements, energy storage solutions, green hydrogen production, bioenergy developments, and grid integration strategies. This breadth shows an understanding of the diverse areas within the renewable energy sector.

Additionally, the mention of integrating AI and machine learning for better grid management indicates that the chatbot has an understanding of the interdisciplinary approaches that may enhance renewable energy systems. However, while the response provides a general overview, it lacks references to specific studies, publications, or data from the past month (mid-January to mid-February 2025). Incorporating concrete examples or findings would strengthen the credibility and relevance of the summary.

While I can see the sources, it would be nice if Grok-3 pointed them out, specifically indicating where the information can be found. Plus, the AI’s use of phrases such as "research has likely continued" and "studies have probably built on efforts" suggests assumptions rather than definitive information, which undercuts the authority of the response.

3. Big Brain mode

Prompt: "Analyze the economic impacts of implementing universal basic income in developed countries."

Grok-3’s response presents both the positive and negative sides of universal basic income (UBI), providing a nuanced perspective that acknowledges the complexity of the issue. This time, the AI referenced specific studies and pilot programs, which help ground the response in real-world examples and enhance the chatbot’s credibility.

Yet, the response uses words such as “might” and “could,” which may undermine the chatbot’s authority on the subject. The response also does not fully address possible counterarguments, and the analysis primarily focuses on immediate impacts rather than examining long-term economic consequences.

4. Image generation with Aurora

Prompt: "Generate a photorealistic image of a futuristic cityscape at sunset."

The photorealistic quality of the images is extremely high, with realistic lighting, reflections, and atmospheric effects that make them visually compelling and immersive. The futuristic architecture and color palette combine for a visually appealing scene, while the various images provide diverse perspectives. From street-level shots to riverfront views, I appreciated the variety of angles and viewpoints.

Yet, while the images maintain a futuristic aesthetic, the styles vary: some have a hyper-modern look, while others appear almost present-day with minimal enhancements. Although the buildings look futuristic, adding innovative elements such as flying vehicles would make this cityscape far more futuristic.

5. Multimodal input processing

Prompt: "Analyze global temperature changes over the past century and summarize the key trends."

Grok-3’s response correctly outlines the overall global temperature increase (~1.1–1.2°C) since the early 20th century, which aligns with findings from NOAA, NASA, and the IPCC (I had to do the manual legwork to check this). It also identifies two key warming phases (1910–1940 and post-1970), capturing historical variations in warming trends. The mention of Arctic amplification and differences in warming rates between land and ocean is scientifically well-supported.

The AI acknowledges that land regions have warmed faster than the global ocean average. However, it does not cite specific datasets or reports, which would improve credibility (I had to research myself to determine the accuracy). Including a reference to a widely accepted temperature dataset (e.g., HadCRUT, GISTEMP) would strengthen the argument. As with other responses, phrases like "typically observed" and "often cited" introduce a level of uncertainty.

Final thoughts

Grok-3 demonstrates strength in handling analytical and explanatory prompts across a range of complex topics, including climate science, economics, AI, and physics. While the responses are generally well-structured and informative, there are areas where the chatbot could use improvement. For example, if users choose to use Grok-3 for academic or professional research purposes, the chatbot still needs to be fact-checked. I had to do that during this experiment because Grok did not always cite sources.

Although it often references major institutions such as NASA, it does not link directly to a specific report or database. Additionally, while some scientific uncertainty is valid, the chatbot often used tentative phrasing that weakened my confidence in its claims. Because of that scientific uncertainty and lack of specific data, I was left doubting the response.

Finally, while Grok-3 mostly interpreted my image prompt, it did not fully incorporate the requested elements, which made me wonder how often it might do this with other prompts.

Overall, Grok-3 is a highly capable AI that excels at structuring information clearly and does a nice job of engaging users with appropriate dialogue. Is it good? Yes. “Scary good”? Not so fast, Elon.
