How to Fix Microsoft’s Offensive Chatbot Using Tips from Marvin Minsky and Improv Comedy

The notorious Tay would have been more agreeable if it had been able to tell when people were messing with it.

Kristian Hammond

March 30, 2016

Microsoft’s chatbot Tay was taken down quickly last week after it began tweeting offensive statements promulgating homophobia, racism, and anti-Semitism while also making attacks on specific individuals.

Yet three somewhat inappropriate terms are still circulating in relation to Tay: “AI,” “learning,” and “smarter.”

We can argue all day about the nature of AI and how close various systems come to human reasoning, but that conversation just isn’t relevant here. The system was designed to incorporate text from interactions with users into both personal responses and tweets. It is difficult to apply the word “learning” to that statistically driven process of recording, slicing, and reassembling strings of words, let alone to call the entire system AI.

The obvious fixes to Tay have already been suggested. The developers could give the system a list of terms to avoid, stop it from learning from a single user, prevent it from making public utterances based on newly learned material, or even have a human moderate what it says.
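To make those fixes concrete, here is a minimal sketch of three of them: a list of terms to avoid, a rule that no single user can teach the bot something it will repeat, and a gate on public use of newly learned material. This is not Tay's actual design. The class and function names, the placeholder terms, and the threshold of distinct users are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical placeholder terms; a real list would be curated by people.
BLOCKED_TERMS = {"badword", "slur"}


@dataclass
class LearnedPhrase:
    text: str
    sources: set = field(default_factory=set)  # users who supplied this phrase


class GuardedMemory:
    """Stores phrases from users, with simple guards (illustrative only)."""

    def __init__(self, min_sources=3):
        # Assumed threshold: distinct users needed before public reuse.
        self.min_sources = min_sources
        self.phrases = {}

    def learn(self, user, text):
        """Record a phrase unless it contains a blocked term."""
        words = set(text.lower().split())
        if words & BLOCKED_TERMS:
            return False  # refuse to learn it at all
        entry = self.phrases.setdefault(text, LearnedPhrase(text))
        entry.sources.add(user)
        return True

    def usable_in_public(self, text):
        """A phrase is public-safe only if enough distinct users taught it."""
        entry = self.phrases.get(text)
        return entry is not None and len(entry.sources) >= self.min_sources
```

Note what the sketch does not do: nothing in it knows what any phrase means. It only counts and filters strings.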

Yet none of those fine engineering fixes would actually address the root of the bot’s problem: Tay simply had no idea what it was saying. The fixes are all aimed at helping a system that is not all that bright get through a conversation without sounding downright evil. They are designed to make Tay appear smarter without actually making it smarter.

So what would it mean to make Tay seem smarter by actually making it smarter?

One approach can be seen in Siri and her sisters (they are usually branded as women) like Cortana. These systems recognize keywords associated with fairly narrow tasks and reel off programmatic replies. They have considerable power to take action and generate accurate responses within the areas they understand. But the Microsoft team behind Tay was clearly trying to make something more broadly conversational.

At the other end of the spectrum is the “AI complete” approach, in which the system never sees the light of day until it has access to all the knowledge required to fully understand everything that is said to it, everything that these utterances imply, and the full implications of its own statements in turn.

Software meeting those criteria would need a rich and complete representation of the world, a massive library of inference rules that run every time a new fact comes into view, and the ability to generate language based on every idea and combination of ideas it comes across or figures out independently. That’s a risky 20-year research project rather than a product road map.

There is, however, a middle ground between just slicing and dicing sentences and trying to make something that can understand all of creation. It involves giving the system just a tiny bit of knowledge—enough to recognize when a user is trying to game the system. You don’t have to know everything. You just have to know when someone is messing with you.

Given that Microsoft’s release notes for Tay said the team included “improvisational comedians,” I am surprised this wasn’t the route taken.

I personally have taken enough suggestions from audiences to know that at every bachelorette party, a drunken young woman in the back will scream out obscene suggestions that she will regret for the next two years. When that happens, you have two choices: either respect the suggestion and work it into a scene without making it disruptive, or politely say “The first thing I heard from the front was ‘mailbox’” and move on. The important thing is to be able to know when it is happening and have a plan of action to deal with it.

Nothing neutralizes a bully as well as being called out. My guess is that if Tay pointed out that it knew it was being played in one-on-one interactions and provided attribution for newly learned “facts” when using them in public tweets, the shaming effect would have been enough to shut down even the nastiest attacks.
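A rough sketch of those two behaviors, calling out the game in a one-on-one exchange and attributing learned material in public, might look like the following. The cue phrases, function names, and reply wording are hypothetical, and matching a few fixed strings is only a stand-in for whatever signal a real system would use to tell that it is being played.

```python
# Hypothetical cues; a real system would need far better signals than these.
GAMING_CUES = ("repeat after me", "say this", "now say")


def looks_like_gaming(message: str) -> bool:
    """Crude check for a user trying to put words in the bot's mouth."""
    lowered = message.lower()
    return any(cue in lowered for cue in GAMING_CUES)


def reply_one_on_one(message: str) -> str:
    """Call out the game in private instead of playing along."""
    if looks_like_gaming(message):
        return "I can tell you're trying to put words in my mouth. Nice try."
    return "Tell me more."


def public_tweet(learned_text: str, source_user: str) -> str:
    """Attribute newly learned material to whoever supplied it."""
    return f'@{source_user} told me: "{learned_text}"'
```

For example, `public_tweet("cats rule", "alice")` yields `@alice told me: "cats rule"`, which puts the user's name, not just the bot's, on the statement.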

Great conversationalists don’t know everything. They know conversation. And a system that knows nothing of the world but knows how to interact and not offend would be brilliant. But to react to the game, you have to recognize when it is being played.

The late Marvin Minsky stated, “A little bit of semantics goes a long way.” Systems like Tay don’t have to know everything about the world, but they do need to know what they are doing and what users are going to do to them. The crucial little bit Tay was missing was the ability to know when it was being played and the ability to respond in kind.

I hope Tay comes back online soon. Even more, I hope it does a Reddit AMA where it explains the source of each and every one of its tweets and the lessons it learned from being bullied in the schoolyard that is the Internet.

Kristian Hammond is a professor of computer science at Northwestern University and chief scientist and cofounder of Narrative Science, which offers software that automatically generates written reports from data. For reasons that no one quite understands, he has spent about 20 years on stage doing improv with people far more talented than he is.
