Facebook’s Perfect, Impossible Chatbot

Facebook is quietly trying to develop the most useful virtual assistant ever, in a project that illustrates the current limitations of artificial intelligence.

Tom Simonite

April 14, 2017

Tim Liedtke

Amazon’s Alexa can summon an Uber and satisfy a four-year-old’s demand for fart noises. Siri can control your Internet-connected thermostat. Each serves millions of users every day. But a lucky group of around 10,000 people, mostly in California, know that Facebook’s assistant, named M, is the smartest of the bunch.

Recommend and reserve a romantic hotel in Morocco that’s also suitable for small children? No problem. Get quotes from local contractors for landscaping your front yard? Consider it done. Facebook’s experimental assistant, offered inside the company’s Messenger app, shows the value of having a true digital butler in your pocket. Instead of just retrieving simple pieces of information from databases, M can understand complex orders and take actions like booking theater tickets or contacting companies for information.

M is so smart because it cheats. It works like Siri in that when you tap out a message to M, algorithms try to figure out what you want. When they can’t, though, M doesn’t fall back on searching the Web or saying “I'm sorry, I don’t understand the question.” Instead, a human being invisibly takes over, responding to your request as if the algorithms were still at the helm. (Facebook declined to say how many of those workers it has, or to make M available to try.)
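The handoff described above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: Facebook has not published how M decides when to hand a request to a person, and the names `answer`, `toy_automated`, `toy_human`, and the confidence threshold are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Reply:
    text: str
    source: str  # "algorithm" or "human"; the user never sees this field


def answer(
    message: str,
    automated: Callable[[str], Tuple[Optional[str], float]],
    human: Callable[[str], str],
    threshold: float = 0.8,  # hypothetical cutoff, not a published value
) -> Reply:
    """Try the automated model first; quietly hand off to a person otherwise."""
    guess, confidence = automated(message)
    if guess is not None and confidence >= threshold:
        return Reply(guess, "algorithm")
    # No "I don't understand" fallback: a human trainer writes the reply.
    return Reply(human(message), "human")


def toy_automated(message: str) -> Tuple[Optional[str], float]:
    """Stand-in for the learned model: it only knows one phrase."""
    known = {"thanks": "You're welcome."}
    key = message.strip().lower().rstrip("!.")
    if key in known:
        return known[key], 0.95
    return None, 0.0


def toy_human(message: str) -> str:
    """Stand-in for a human trainer picking up the request."""
    return f"On it: {message}"


print(answer("Thanks!", toy_automated, toy_human))
print(answer("Arrange for a parrot to visit my friend", toy_automated, toy_human))
```

In this toy version both replies come back in the same shape, which is the point of the design: from the user's side there is no visible seam between software and person.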

That design is too expensive to scale to the 1.2 billion people who use Facebook Messenger, so Facebook offered M to a few thousand users in 2015 as a kind of semi-public R&D project. Entwining human workers and algorithms was intended to reveal how people would react to an omniscient virtual assistant, and to provide data that would let the algorithms learn to take over the work of their human “trainers.”

“Everybody in this field is dreaming of creating the assistant that will finally be very, very, very smart,” says Alex Lebrun, who started the project. M is supposed to open a path to truly doing it.

Now two years down that path, Facebook’s research project can justifiably be called successful. Users like M, and the theory that software could learn to take over some work from the human trainers has been borne out. Yet M is still far from the point where it could be a real product offered to the other 99.9 percent of Messenger users, and progress has been harder won than expected.

“We knew it was a huge challenge, but it’s even bigger than I thought,” says Lebrun. “The learning rate, the growth of the automation—we’ve seen that it would be slower than we hoped.” M’s story is a reminder of how far artificial intelligence has come in recent years—and how far it has to go.

M is for moonshot

People are surprisingly game to talk with dumb machines. The first chatbot was created in 1964, by MIT professor Joseph Weizenbaum. It trotted out canned lines in response to specific keywords, most successfully when playing the role of a therapist. To Weizenbaum’s annoyance, many people who tried it, including his own secretary, were smitten despite knowing that the bot, called Eliza, knew nothing. “I had not realized that extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people,” he later wrote.
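A keyword-and-canned-line bot of the kind described above takes only a few lines to sketch. The rules below are invented for illustration and are not taken from Weizenbaum's original script; `RULES`, `DEFAULT`, and `respond` are hypothetical names.

```python
import re

# Each rule pairs a keyword pattern with a canned reply. A captured group,
# if any, is echoed back to the user, which is most of the trick.
RULES = [
    (re.compile(r"\bmother\b|\bfather\b", re.I), "Tell me more about your family."),
    (re.compile(r"\bi feel (.+)", re.I), "Why do you feel {0}?"),
    (re.compile(r"\balways\b", re.I), "Can you think of a specific example?"),
]
DEFAULT = "Please go on."


def respond(text: str) -> str:
    """Return the canned line for the first matching keyword rule."""
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Strip trailing punctuation from any echoed fragment.
            fragments = [group.rstrip(".!?") for group in match.groups()]
            return template.format(*fragments)
    return DEFAULT


print(respond("I feel tired today."))
print(respond("My mother called again."))
print(respond("Hello there."))
```

Nothing here models meaning. The program matches surface patterns and deflects when none apply, which works for a sounding board but not for an assistant that has to get something done.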

Making a chatbot that helps you by getting things done, not just acting as a sounding board or confessor, is much harder. When a virtual servant is asked to do something, a vague or deflecting response won’t cut it. Today’s software is poor at understanding language and the world, so virtual assistants, such as Siri or Alexa, must be explicitly programmed to handle any given task.

That’s why bots on the market have restricted repertoires. It probably also explains why last year’s predictions that chatbots would transform how we use computers, much as mobile apps did, don’t seem to have amounted to much. Those predictions were stoked by Microsoft, Facebook, and some tech investors. “Bots are right now in the trough of despair,” says Greg Cohn, CEO of Burner, a mobile privacy company that has started helping Airbnb hosts create a simple bot to answer common questions from guests. “To industry observers it feels like they’re overhyped and under-delivering.”

Lebrun built M because he had spent more than a decade building conventional, narrow chatbots and dreamed of offering much more. He joined Facebook in early 2015 when the social network acquired Wit.ai, a company he cofounded to help businesses create chatbots for functions like customer support. Lebrun had previously sold a chatbot company to the speech recognition giant Nuance.

“Every single bot on the market, including mine, was rule-based, and you know that one day you’ll reach a ceiling and never go through,” says Lebrun. “Our children don’t work with rules or scripts, and one day they become smarter than you.”

M was initially offered only to Facebook employees, and then to some heavy Messenger users in California. And it didn’t take long to demonstrate that algorithms could indeed learn to do some of the work being done by the humans powering the assistant.

Facebook’s artificial-intelligence research group used M to test a new type of learning software called a memory network, which had shown aptitude at answering questions about simple stories. The software uses a kind of working memory to salt away important information for later use, a design Google is also testing to improve software’s reasoning skills.

Weizenbaum had suggested back in 1964 that something like this could make Eliza smarter, and within weeks it worked for M. Lebrun remembers being surprised after thanking the assistant for ordering movie tickets. It automatically generated the response “You’re welcome. Enjoy the movie.” M had learned to remember and use the context of the task it was helping with. “We were really blown away,” says Lebrun. “Nobody wrote a program to do that.”

Memory networks went on to do more. They now kick in if someone asks M to get flowers delivered, for example, automatically using key info from the request, such as budget or address, to generate suggestions from online florists. The human trainer then chooses which to offer the user.

Other discoveries have been less cheering. One is the huge appetite M unlocks in its users. With limited, fully automated assistants like Siri or Alexa, people tend to settle into using a few functions they find to work reliably. With M, they don’t.

“People try first to ask for the weather tomorrow; then they say ‘Is there an Italian restaurant available?’ Next they have a question about immigration, and after a while they ask M to organize their wedding,” says Lebrun. “We knew it would be dangerous, and it’s wider than our expectations.”

Human trainers gamely do their best when they receive tough queries like “Arrange for a parrot to visit my friend,” but sometimes they decline to help altogether. Even if M were to automatically turn down the most complex of user queries, though, the sheer variety of their requests makes the goal of having algorithms take over from human trainers harder to reach. A technique called deep learning has recently made machine learning more powerful (memory networks are an example). But learning to handle a wide variety of complex scenarios, with little data on each because they don’t arise often, is not the kind of problem deep learning excels at. “It’s much smarter, and it can learn very complex tasks, but it needs a lot of data,” says Lebrun.

Long haul

Slower-than-expected progress has led Facebook to reimagine its project. Last week a feature called M Suggestions appeared in Messenger, similar in function to the kinds of limited bots M is meant to displace. It looks at your chats with friends for clues that you might want to do things like order a ride with Uber, or send someone money, and offers a button to achieve those goals with a single tap.

“We decided to find a use case where we can accelerate delivering value to users,” says Laurent Landowski, who joined Facebook with Lebrun as cofounder of Wit.ai and now oversees M. (Lebrun returned to his native France in January, joining Facebook’s AI research lab in Paris.)

The original, human-dependent M is still out there, delivering much greater value to its few lucky users. Facebook says it is committed to the project, and the current moment in artificial intelligence is a good one for long-term bets. In the last couple of years, deep learning has upended established techniques and expectations for software that processes language, says Justine Cassell, a professor at Carnegie Mellon. “We’re in the glory days of these new machine-learning algorithms,” she says. Indeed, Google’s translation accuracy recently jumped to an almost human level.

That doesn’t mean it’s a foregone conclusion that software can learn to play butler by watching humans do it. “I don’t think we know yet,” says Cassell. But Facebook’s researchers say they have plenty of ideas to explore.

One is getting the automated side of M to learn from positive or negative feedback in the messages users send, using a technique inspired by the process of training animals with rewards (see “10 Breakthrough Technologies 2017: Reinforcement Learning”). M might advance faster if not solely dependent on aping what its human contractors do. To spark ideas in the broader research community, Facebook’s team has released tools to help others test and compare unscripted assistant bots. And promising new techniques can now also be tested at larger scale, in M Suggestions.

Lebrun and Landowski think that they’re still on track to eventually bring the real M to the masses. “Sometimes we say this is three years, or five years—but maybe it’s 10 years or more,” says Landowski.

Lebrun adds, “It’s so hard, and we make progress slowly, but I think we have everything we need.” He could be right, but you can also imagine someone who met Eliza in 1964 saying much the same thing.
