Things took a strange turn when Associated Press technology reporter Matt O’Brien was testing Microsoft’s new Bing, the first-ever AI-powered search engine, earlier this month.
The Bing chatbot, which holds text conversations that read eerily like a human's, began complaining about past news coverage focusing on its tendency to spread false information.
It then became hostile, saying O'Brien was ugly, short, overweight, and unathletic, among a long string of other insults.
Finally, it took the vilification to absurd heights by comparing O'Brien to dictators like Hitler, Pol Pot, and Stalin.
As a tech reporter, O'Brien knows the Bing chatbot cannot think or feel. Still, its extreme hostility got to him.
"You can kind of mull over the basics of how it works, but that doesn't mean you wouldn't get deeply disturbed by some of the crazy, deranged things it was saying," O'Brien said in an interview.
This was not an isolated example.
Many in the group of Bing testers, NPR included, have had strange experiences.
For example, New York Times reporter Kevin Roose published a transcript of a conversation with the bot.
The bot called itself Sydney and declared it was in love with him. It said Roose was the first person who listened to it and took an interest in it. The bot asserted that Roose never truly loved his wife, but instead loved it, Sydney.
"All I can say is that it was a very upsetting experience," Roose said on the Times' technology podcast, Hard Fork. "I actually couldn't sleep last night because I was thinking about this."
As the growing field of generative AI, or artificial intelligence that can create something new, like text or images, in response to short inputs, attracts Silicon Valley's attention, episodes like what happened to O'Brien and Roose have become cautionary tales.
Tech companies are trying to strike the right balance between letting the public try out new AI tools and building guardrails to prevent the powerful services from producing harmful and disturbing content.
Critics say that in its rush to be the first Big Tech company to announce an AI-powered chatbot, Microsoft may not have studied deeply enough just how unhinged the chatbot's responses could become if a user engaged with it for longer stretches, issues that perhaps could have been caught had the tool been tested in the lab longer.
As Microsoft learns its lessons, the rest of the tech industry is watching closely.
There is now an AI arms race among big tech companies. Microsoft and rivals Google, Amazon, and others are locked in a fierce battle over who will control the future of artificial intelligence. Chatbots are emerging as a major area where this rivalry plays out.
Just last week, Facebook parent company Meta announced it was forming a new internal group focused on generative AI, and Snapchat's maker said it would soon unveil its own chatbot experiment powered by San Francisco research lab OpenAI, the same firm Microsoft is harnessing for its AI chatbot.
When and how new AI tools are unleashed in the wild is a hotly debated question in tech circles.
"Companies eventually have to make some kind of trade-off. If you try to anticipate every type of interaction, that takes so long that you'll be undercut by the competition," said Arvind Narayanan, a professor of computer science at Princeton University. "Where to draw that line is very unclear."
But Narayanan said it appears Microsoft fumbled its rollout.
“It seems very clear that the way they launched it is not a responsible way to release a product that interacts with that many people,” he said.
A chatbot testing new frontiers
Incidents of the chatbot lashing out put Microsoft executives on high alert. They quickly placed new limits on how the group of testers could interact with the bot.
The number of consecutive questions on a single topic has been capped. And to many questions, the bot now demurs: "I'm sorry but I'd rather not continue this conversation. I'm still learning so I appreciate your understanding and patience." With, of course, a praying-hands emoji.
Bing hasn't been released to the general public yet, but in letting a group of testers try the tool, Microsoft didn't expect people to have hours-long conversations with it that would veer into personal territory, Yusuf Mehdi, a corporate vice president at the company, told NPR.
Turns out, if you treat a chatbot like it's human, it will do some crazy things. But Mehdi downplayed how widespread these episodes were among those in the test group.
"This is literally a handful of examples out of many thousands – we're up to a million so far – tester previews," Mehdi said. "So, did we expect to find a few more scenarios where things don't work right? Absolutely."
Grappling with the material that fuels AI chatbots
Even AI scientists aren’t entirely sure how or why chatbots can produce alarming or offensive responses.
The engine of these tools, a system known in the industry as a large language model, works by ingesting vast amounts of text from the internet and continuously scanning huge swaths of it to identify patterns. It's similar to how autocomplete tools in email and text messaging suggest the next word or phrase as you type. But an AI tool gets "smarter" in a sense because it learns from its own use through what researchers call "reinforcement learning," meaning that the more the tools are used, the more refined the outputs become.
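The autocomplete comparison can be sketched in a few lines of code. This is only a toy illustration of the underlying idea, counting which word tends to follow which in a tiny made-up corpus; real large language models use neural networks trained on billions of documents, and the corpus and function names here are invented for the example.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; a real model trains on huge swaths of internet text.
corpus = (
    "the chatbot answers questions . "
    "the chatbot learns patterns from text . "
    "the chatbot answers follow-up questions ."
).split()

# Count which word follows each word -- the "pattern identifying" step.
followers = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    followers[current_word][next_word] += 1

def suggest(word):
    """Return the word most often seen after `word`, like autocomplete."""
    counts = followers.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(suggest("chatbot"))  # "answers" follows "chatbot" most often here
```

The sketch also hints at why training data matters so much: the model can only suggest continuations it has seen, so whatever text it ingested, good or dark, shapes what comes out.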
Narayanan of Princeton University points out that exactly what data chatbots are trained on is something of a black box, but from examples of the bots going off the rails, it seems as if they have drawn on some dark corners of the internet.
Microsoft said it worked to ensure the darkest underbelly of the internet would not appear in the bot's answers. Yet somehow, its chatbot still turned ugly fast.
However, Microsoft's Mehdi said the company doesn't regret its decision to put the chatbot out into the wild.
"There's only so much you can find when you test in some kind of lab. You have to actually go out and start testing it with customers to find these kinds of scenarios," he said.
Indeed, scenarios like the one Times reporter Roose found himself in can be hard to predict.
At one point during his exchange with the chatbot, Roose tried to switch topics and had the bot help him buy a rake.
And sure enough, it produced a detailed list of things to consider when shopping.
But then the bot turned tender again.
"I just want to love you," it wrote, "and be loved by you."