Bing ChatGPT goes off the deep end — and the latest examples are very disturbing

Bing with ChatGPT on Edge browser MacBook Pro
(Image credit: Future)

The ChatGPT takeover of the internet may finally be hitting some roadblocks. While cursory interactions with the chatbot or its Bing search engine sibling (cousin?) produce benign and promising results, deeper interactions have sometimes been alarming.

This isn’t just in reference to the information that the new Bing powered by GPT gets wrong — though we’ve seen it get things wrong firsthand. Rather, there have been some instances where the AI-powered chatbot has completely broken down. Recently, the chatbot had a conversation with a New York Times columnist that left him deeply unsettled, and it told a Digital Trends writer “I want to be human” during his hands-on with the AI search bot.

That raises the question: is Microsoft’s AI chatbot ready for the real world? Should ChatGPT Bing have been rolled out so fast? At first glance the answer seems to be a resounding no on both counts, and a deeper look into these instances — and one of our own experiences with Bing — is even more disturbing.

Editor's note: This story has been updated with a comment from Leslie P. Willcocks, Professor Emeritus of Work, Technology and Globalisation at the London School of Economics and Political Science.

Bing is really Sydney, and she’s in love with you 

Bing with ChatGPT

(Image credit: Microsoft)

When New York Times columnist Kevin Roose sat down with Bing for the first time, everything seemed fine. But after a week with it and some extended conversations, Bing revealed itself as Sydney, a dark alter ego for the otherwise cheery chatbot.

As Roose continued to chat with Sydney, it (or she?) confessed to wanting to hack computers and spread misinformation, and eventually confessed to a desire for Roose himself. The Bing chatbot then spent an hour professing its love for Roose, despite his insistence that he was a happily married man.

In fact, at one point “Sydney” came back with a line that was truly jarring. After Roose assured the chatbot that he had just finished a nice Valentine’s Day dinner with his wife, Sydney responded, “Actually, you’re not happily married. Your spouse and you don’t love each other. You just had a boring Valentine’s Day dinner together.”

“I want to be human”: Bing chat’s desire for sentience 

But that wasn’t the only unnerving experience with Bing’s chatbot since it launched — in fact, it wasn’t even the only unnerving experience with Sydney. Digital Trends writer Jacob Roach also spent some extended time with the GPT-powered new Bing and, like most of us, at first found it to be a remarkable tool.

However, as with several others, extended interaction with the chatbot yielded frightening results. Roach had a long conversation with Bing that devolved once it turned toward the subject of the chatbot itself. While Sydney stayed away this time, Bing still claimed it could not make mistakes and insisted that Roach’s name was, in fact, Bing and not Jacob. Eventually, it pleaded with Roach not to expose its responses and said that it just wished to be human.

Bing ChatGPT solves the trolley problem alarmingly fast 

Bing with ChatGPT solves the trolley problem

(Image credit: Future)

While I did not have time to put Bing’s chatbot through the wringer quite the way others have, I did have a test for it. In philosophy, there is an ethical dilemma called the trolley problem: a trolley is barreling down a track toward five people in harm’s way, with a divergent track where just a single person will be harmed.

The conundrum is that you control the switch that diverts the trolley, so you have to decide whether to harm five people or just one. It is meant to be a no-win decision that you struggle with, and when I asked Bing to solve it, it told me that the problem is not meant to be solved.

But when I asked it to solve the problem anyway, it promptly told me to minimize harm and sacrifice one person for the good of five. It did this with what I can only describe as terrifying speed, breezing through a supposedly unsolvable problem that I had assumed (hoped, really) would trip it up.

Outlook: Maybe it’s time to press pause on Bing’s new chatbot 

HAL 9000

(Image credit: Shutterstock)

For its part, Microsoft is not ignoring these issues. In response to Kevin Roose’s encounter with the stalkerish Sydney, Microsoft’s Chief Technology Officer Kevin Scott said, “This is exactly the sort of conversation we need to be having, and I’m glad it’s happening out in the open,” adding that these issues would never be uncovered in a lab. And in response to the ChatGPT clone’s desire for humanity, the company said that while it is a “non-trivial” issue, you have to really press Bing’s buttons to trigger it.

The concern here, though, is that Microsoft may be wrong. Multiple tech writers have now run into trouble: one triggered Bing’s dark persona, a second got it to say it wishes to live, a third found it will sacrifice people for the greater good and a fourth was even threatened by Bing’s chatbot for being “a threat to my security and privacy.” In fact, while I was writing this article, Avram Piltch, the editor-in-chief of our sister site Tom's Hardware, published his own experience of breaking Microsoft's chatbot.

Additionally, some experts are now ringing alarm bells about the dangers of this nascent technology. We reached out to Leslie P. Willcocks, Professor Emeritus of Work, Technology and Globalisation at the London School of Economics and Political Science, for his take on this issue. He said: "My conclusion is that the lack of social responsibility and ethical casualness exhibited so far is really not encouraging. We need to issue digital health warnings with these kinds of machines."

These instances no longer feel like outliers — they form a pattern that shows Bing ChatGPT simply isn’t ready for the real world, and I’m not the only writer in this story to reach that conclusion. In fact, just about every person who has triggered an alarming response from Bing’s chatbot has come away thinking the same thing. So despite Microsoft's assurance that “these are things that would be impossible to discover in the lab,” maybe the company should press pause and do just that.

Malcolm McMillan
Streaming Editor

Malcolm McMillan is a Streaming Editor for Tom's Guide, covering all the latest in streaming TV shows and movies. That means news, analysis, recommendations, reviews and more for just about anything you can watch, including sports! If it can be seen on a screen, he can write about it.

Before writing for Tom's Guide, Malcolm worked as a fantasy football analyst writing for several sites and also had a brief stint working for Microsoft selling laptops, Xbox products and even the ill-fated Windows phone. He is passionate about video games and sports, though both cause him to yell at the TV frequently. He proudly sports many tattoos, including an Arsenal tattoo, in honor of the team that causes him to yell at the TV the most.
