ChatGPT sucks at Wordle, it won’t even help me cheat

Wordle is open on a phone held in a hand. TRAIN and GRIND have been played, with the RIN in GRIND all green.

(Image credit: Mike Kemp / Getty Images)

Despite my job, I’m not very good at Wordle. But, like the good folks in Maine, I’m also not above cheating.

Rather than merely Google the answer, I asked an incredibly smart AI trained on over 500 billion words, the new Bing with ChatGPT, to help me. Easy, right? It turns out, not so much.

Beware spoilers for today’s Wordle.

After three guesses of my own left me with just two correctly placed ‘E’ letters but nothing else in a pattern of _E_E_, I was stumped. The new Bing is capable of so many things, including the ability to produce code for games in seconds. But in this case it seemed to struggle with counting to five. When I asked it to help me with that Wordle setup, it initially suggested only four-letter words, “mere” and then “gene”.

After I reminded it that we were looking for a five-letter word, the AI suggested a batch of possible solutions that were, again, incompatible.

None of the suggestions had an ‘E’ as the second letter.

Bing's response to a Wordle problem (Image credit: Future)

How did Bard do?

In brief, just as bad. If not arguably worse.

I’ve tested Bing with ChatGPT vs Bard a lot, and I’ve yet to see both handle a task so poorly. When I asked it to solve the Wordle in the hashtag format (e.g. #E#E#), Bard stated “The Wordle #384 is ‘Eager’”, which isn’t true (that one was Showy) and isn’t compatible with the answer we’re looking for.

I implored Bard to try again, and it once again claimed to be answering Wordle #384, this time convinced the answer was “Voice”. Again, this doesn’t fit the letters we already know.

After a painful conversation going round in circles, during which time Bard accused me of trying to trick it, I tried presenting the puzzle using underscores: _E_E_. Its first answer was “EEE”, but unless this was dolphin-speak for Beset (the correct answer), Bard was wrong again.

Bard made several other incompatible guesses before concluding “You’re right. There is no five-letter word where the second letter is E and the fourth letter is E.” Somehow I managed to gaslight Google.

Why is AI so bad at Wordle?

Considering they are based on expensively created large language models like GPT-4 and LaMDA, you would think these AI chatbots could handle Wordle no problem. But clearly, that's not the case. That may also explain why those creating games with ChatGPT, like Sumplete, have stuck to working with numbers.

In a piece for The Conversation, Michael G. Madden, Professor of Computer Science at the University of Galway, explains that because of the neural networks these chatbots run on, “all text inputs must be encoded as numbers and the process that does this doesn’t capture the structure of letters within words.”

He writes: "At the core of ChatGPT is a deep neural network: a complex mathematical function – or rule – that maps inputs to outputs. The inputs and outputs must be numbers. Since ChatGPT4 works with words, these must be 'translated' to numbers for the neural network to work with them.

"The translation is performed by a computer program called a tokenizer, which maintains a huge list of words and letter sequences, called “tokens”. These tokens are identified by numbers. A word such as 'friend' has a token ID of 6756, so a word such as 'friendship' is broken down into the tokens 'friend' and 'ship'. These are represented as the identifiers 6756 and 6729.

"When the user enters a question, the words are translated into numbers before ChatGPT4 even starts processing the request. The deep neural network does not have access to the words as text, so it cannot really reason about the letters."

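To make Madden's point concrete, here is a minimal sketch of a toy tokenizer. Only the two IDs for 'friend' and 'ship' come from his example; the greedy longest-match rule, the tiny vocabulary and the function name `toy_tokenize` are assumptions for illustration, not how ChatGPT's real tokenizer is built.

```python
# Toy vocabulary: the two IDs are the ones quoted in the article.
# Everything else in this sketch is illustrative.
TOY_VOCAB = {"friend": 6756, "ship": 6729}


def toy_tokenize(word, vocab=TOY_VOCAB):
    """Split `word` into known tokens (longest match first) and return their IDs."""
    ids = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shorter ones.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {word[i:]!r}")
    return ids


print(toy_tokenize("friendship"))  # [6756, 6729]
```

All the network receives is the pair of numbers. Nothing in 6756 or 6729 says that the second letter of "friendship" is R, which is exactly the kind of fact a Wordle clue depends on.
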
How can this inaccuracy be fixed? Madden outlines two ways that future LLMs could overcome it. First, ChatGPT knows the first letter of every word, so its training data could be augmented to include mappings of every letter position within every word in its dictionary.
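
As a rough idea of what such a letter-position mapping might look like, the sketch below spells out each position of a word as plain text. The output format and the function names are hypothetical; the article does not say how the extra training data would actually be written.

```python
def letter_positions(word):
    """Return (position, letter) pairs, counting positions from 1."""
    return [(i, ch) for i, ch in enumerate(word.upper(), start=1)]


def as_training_line(word):
    """Render the mapping as one line of text (a hypothetical format)."""
    parts = [f"letter {i} is {ch}" for i, ch in letter_positions(word)]
    return f"{word.upper()}: " + ", ".join(parts)


print(as_training_line("beset"))
# BESET: letter 1 is B, letter 2 is E, letter 3 is S, letter 4 is E, letter 5 is T
```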

The second, Madden writes, is a more exciting and general solution. Future LLMs could generate code to solve problems like this… A recent paper demonstrated an idea called Toolformer, where an LLM uses external tools to carry out tasks where they normally struggle, such as arithmetic calculations.
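
For a sense of how little code the job needs, here is a short sketch of the sort of program a chatbot could write and run instead of guessing. It is not Toolformer's actual mechanism. The `matches` helper is an assumption for illustration, and the candidate list is just the words mentioned in this article rather than a real Wordle dictionary.

```python
def matches(pattern, word):
    """True if `word` fits `pattern`, where '_' stands for any single letter."""
    if len(pattern) != len(word):
        return False
    return all(p == "_" or p == w for p, w in zip(pattern.upper(), word.upper()))


# The guesses Bing and Bard offered, plus the real answer.
CANDIDATES = ["mere", "gene", "eager", "voice", "showy", "beset"]

print([w for w in CANDIDATES if matches("_E_E_", w)])  # ['beset']
```

A real solver would also need to account for yellow and grey letters, but even this check rules out every wrong answer the chatbots gave.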

For now, fellow Wordle cheats, let's stick to search engines and social media.


Andy is a freelance writer with a passion for streaming and VPNs. Based in the U.K., he originally cut his teeth at Tom's Guide as a Trainee Writer before moving to cover all things tech and streaming at T3. Outside of work, his passions are movies, football (soccer) and Formula 1. He is also something of an amateur screenwriter having studied creative writing at university.
