I'm a published author and I tested Claude 3.7 Sonnet vs Grok 3 at creative writing — here's my verdict
Should human authors be worried?

Have you ever wondered why Shakespeare is considered one of the best writers ever? He obviously has quite the fanbase even though he died in 1616. It’s still common to say “s/he’s no Shakespeare” when referring to a middle school essay or a Facebook rant.
According to ChatGPT, the one attribute that made Shakespeare different was his deep understanding of human nature. While many chatbots these days can craft compelling and accurate prose, it usually comes across as too generic.
As an author myself, I was curious if there was a way to generate written content that was far more readable and interesting, something that’s even worth publishing. I decided to test the two leading chatbot candidates to see if they could write with more creativity and flair.
Battle of the writing bots
I picked Claude 3.7 Sonnet because the developer, Anthropic, claims the bot is capable of writing with more nuance and a better command of tone. Plus, the word “sonnet” denotes a more Shakespearean bent. (After all, the bard popularized that word.)
I also tested Grok 3 because, as Elon Musk’s firm xAI claims, it’s supposed to be the world’s smartest chatbot. I like that Grok is also free for the full version, no strings attached.
Comparing the two chatbots was easy since both are free and do not have restrictions on how much text you can output or how many complex prompts you can use.
I’ve bumped up against the restrictions with bots like ChatGPT and Perplexity many times, but I didn’t want anything impeding my creative impulses. My main goal was simple: I wanted to find out if the two bots could write creatively and with enough emotion compared to my own writing.
Initial test results
Let me say this right off the bat: AI is all about imitation. If you’re hoping that an AI can write a compelling short story, or even an entire novel, that people will want to read without you supplying any parameters, you’ll be sorely disappointed.
When I asked both bots to write a short story from scratch without any specifics, the results were not outstanding. For example, Grok wrote a short story about someone named Mia who finds a clock that can turn back time. It wasn’t terrible, but lines like “the air smelled of rust and secrets” made me think Grok had been trained on schlocky young adult fiction.
I never mentioned which style to use or provided any sample text. It was a shot in the dark and the bot missed.
Claude wrote a short story from scratch as well that was a little more interesting, but lacked detail and charisma. This line at least held my attention: “The year 1901 had barely begun, and already it promised to be as brutal as the century that preceded it.”
Reading the finished story, I kept thinking there was no momentum and no narrative arc. It was a failing grade.
So far, everything I knew about chatbots writing fiction proved accurate — a little stale and generic, not something I’d read in my spare time, but passable and grammatically correct.
I wanted prose that was far more eclectic and varied with more pizzazz.
What worked much better
I found that asking both bots to rewrite one of my own stories from scratch worked much better. I used this old Medium post as a guide.
While I would not say the results dramatically improved on the story, they did have more originality. Claude renamed my story A Case of Grim Determination and imitated my Gothic style.
I tend to use long sentences in my fiction writing, and Claude nailed it: “Throttled with a silken cord wrapped thrice around his neck like a grotesque cravat, his skull fractured at the temple by violent impact, the viscous blood, black as midwinter midnight, had formed a tablecloth-sized Prussian emblem.”
That’s not bad! Chatbots work much better when we set the parameters and guide the outcomes, even if that takes extra work. The writing lacked musicality, though: the kind of prose that flows off the tongue and makes you want to keep reading.
Grok followed my cues as well. The rewrite was even more twisty and had longer sentences, some that droned on a bit too long: “At the alley’s end, beneath a rusted lantern swaying on its chain, lay a figure sprawled in a grotesque tableau — a lamplighter named Percival Grimsby Tate, late of Bristol some 120 miles westward, now in the lamentable condition of being quite dead.”
I didn’t like “late of Bristol” too much, but the output was a bit more eclectic than Claude’s.
Still, the story lacked that extra spark needed in creative writing to hook the reader.
The ultimate winner: humans
I felt a bit frustrated by the process, sensing that the bots were creating the written equivalent of AI slop. We’ve all seen it before — those AI-generated social media posts and images that are really good at gaining traction and tricking the algorithms, but are pale imitations of human creation.
Beauty is in the eye of the beholder, but also the creator. When we create an image or write a story, we’re crafting something that has detail, emotion, nuance, and suspense.
I tried one last test where I asked Claude and Grok to write a story in the style of Jane Austen. Claude wrote The Gentlemen’s Network and Grok wrote The Singular Enterprise of Mr. Percival Ashwood. I don’t think authors like Tilly Wallace have anything to worry about.
Considered one of the all-time greats, Austen is still heralded to this day because of her command of language and her ability to portray real emotion. Novels and short stories are pure fiction but they often tap into lasting truths about the human experience.
Both bots missed the mark — the results were too formulaic. Both Claude and Grok even used a line ripped from the pages of Pride and Prejudice. “It is a truth universally acknowledged, in the refined circles of 19th-century England, that a gentleman of vast fortune must be in want of some extraordinary pursuit to occupy his abundant leisure.” That’s quite lame!
The best analogy I can think of is comparing a Cybertruck to a Jaguar F-Type.
We’ve all heard the jokes before: the electric truck looks like it was created in the game Minecraft. It is a large silver brick with stark angles, lacking nuance and emotion. A Jaguar is a work of art. The curves were designed by someone with an eye for metalcraft.
Both Claude and Grok are certainly capable of writing prose, and they were evenly matched. They imitated my style, and they borrowed from Jane Austen. In the end, what they lacked was a heart, a way to tap into human emotions. To do that, you really need to be a human.
John Brandon is a technologist, business writer, and book author. He first started writing in 2001 when he was downsized from a corporate job. In the early days of his writing career, he wrote features about biometrics and wrote Wi-Fi router and laptop reviews for LAPTOP magazine. Since 2001, he has published over 15,000 articles and has written business columns for both Inc. magazine and Forbes. He has personally tested over 10,000 gadgets in his career.

















