To Build a Smarter Chatbot, First Teach It a Second Language

Translation can help an algorithm’s overall language skills.

Will Knight

July 31, 2017

mr. tech

From Alexa and Siri to countless chatbots and automated customer support lines, computers are gradually learning to talk. The only trouble is they are still very easily confused.

A research team at Salesforce has come up with a clever way to improve the performance of many modern language programs—teaching an algorithm to speak another language before training it to do other tasks.

Teaching machines to hold a coherent conversation remains one of the big outstanding challenges in AI because untangling the meaning of spoken or written text so often relies on a broader understanding of the world, or commonsense knowledge (see “AI’s Language Problem”).

It turns out that training a machine-learning system to translate between two languages automatically teaches it useful things about the relationship and appropriate context of words. When this system is used as the foundation for another machine-learning system—one trained to hold a conversation, say, or to detect the sentiment in text—it performs far better than a system trained from scratch.

“We’re taking machine translation data, and we’re basically teaching the model how to understand words and context,” says Richard Socher, chief scientist at Salesforce and an expert on applying machine learning to language.

The work is one example of how machine-learning advances may help improve the language skills of AI systems. Many deep-learning-based computer vision systems make use of some form of network pre-training, and Socher suggests that machine translation may offer a similar way to bootstrap natural language systems.

Salesforce, an online platform for managing customer interactions across sales, marketing, and commerce, already offers a range of AI tools through its Einstein platform. These include a tool to automatically classify the sentiment of email or chat messages, and another to prioritize the leads a worker is pursuing based on his or her previous activity.

Socher believes this discovery will help improve the natural language capabilities of the Einstein platform. “For chatbots and automating customer support, this is super useful,” he says.

The Salesforce researchers trained a deep-learning system to translate between English and German. This involved feeding a large number of translated documents to a many-layered neural network, and tweaking the parameters of the network until it learned to produce a decent translation for itself. The system represents words using vectors, which is a common way to encode and parse meaning in text.
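To make the idea of word vectors concrete, here is a minimal sketch in Python. The three-dimensional vectors and their values are invented for illustration and are not taken from the Salesforce system, which learns much larger vectors from translation data. The sketch only shows how closeness between vectors can stand in for closeness in meaning.

```python
import math

# Toy 3-dimensional word vectors; the values are made up for illustration
# and are NOT learned from data or taken from the Salesforce model.
TOY_VECTORS = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "terrible": [-0.9, 0.1, 0.0],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Words with similar meaning should point in similar directions.
sim_close = cosine_similarity(TOY_VECTORS["good"], TOY_VECTORS["great"])
sim_far = cosine_similarity(TOY_VECTORS["good"], TOY_VECTORS["terrible"])
print(sim_close > sim_far)
```

In this toy setup, "good" sits closer to "great" than to "terrible," which is the kind of relationship a trained network is expected to capture on its own.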

The researchers then trained the bilingual network to do a variety of things: determine the sentiment of a piece of text; classify different types of questions; and answer questions. They showed that the pre-trained network outperformed one that had not learned a second language.
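The pre-train-then-reuse pattern described here can be sketched in a few lines of Python. This is a toy under loud assumptions, not the researchers' method: the `PRETRAINED` table is a hypothetical stand-in for what a translation network might learn, the encoder simply averages word vectors instead of running a deep network, and the new task is handled by a small logistic-regression classifier. The names `encode`, `train_head`, and `predict` are invented for this example.

```python
import math

# Hypothetical stand-in for word vectors learned during translation training.
# The values are invented for illustration.
PRETRAINED = {
    "good": [0.9, 0.1],
    "great": [0.8, 0.2],
    "bad": [-0.9, 0.1],
    "awful": [-0.8, 0.2],
    "movie": [0.0, 0.5],
}


def encode(text):
    """Frozen encoder: average the pre-trained vectors of the known words."""
    vecs = [PRETRAINED[w] for w in text.lower().split() if w in PRETRAINED]
    if not vecs:
        return [0.0, 0.0]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def predict(weights, bias, features):
    """Logistic-regression probability that the text is positive."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train_head(examples, epochs=200, lr=0.5):
    """Train only the new classifier; the encoder above is never updated."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for text, label in examples:
            feats = encode(text)
            error = predict(weights, bias, feats) - label
            weights = [w - lr * error * x for w, x in zip(weights, feats)]
            bias -= lr * error
    return weights, bias


# Two labeled sentiment examples (1 = positive, 0 = negative).
TRAIN = [("good movie", 1), ("bad movie", 0)]
weights, bias = train_head(TRAIN)

# "great" and "awful" never appear in the sentiment training data.
p_great = predict(weights, bias, encode("great movie"))
p_awful = predict(weights, bias, encode("awful movie"))
print(p_great > 0.5, p_awful < 0.5)
```

The classifier sees only two labeled sentences, yet it handles "great" and "awful" sensibly because the borrowed vectors already place those words near "good" and "bad." That is the intuition behind starting from a network that has already learned something about words.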

Machine translation data sets are particularly large, which helps because machine-learning systems generally need many examples to learn from. “There’s an important connection between translation and the rest of language,” says Bryan McCann, a researcher at Salesforce involved with the project. “[Translation data sets] are very general; they contain information that can be useful across the board for natural language processing.”
