Teaching a computer to play Go at a superhuman level is cool, but not especially useful for you or me. But what if a computer could read a few dozen pages of text, like the manual for a new microwave, and then answer questions about how it works? Sign me up.
Reading and comprehending text is incredibly difficult for computers, but a Canadian company called Maluuba has made progress with an algorithm that can read text and answer questions about it with impressive accuracy. Most importantly, unlike other approaches, it works with just small amounts of text. It might eventually help computers “comprehend” documents.
Researchers from Maluuba posted a paper describing their latest progress last week. It describes an algorithm trained on several hundred children’s stories, each paired with questions and answers about the text. After training, the algorithm could correctly answer multiple-choice questions about an unfamiliar text with more than 70 percent accuracy. The researchers also tested the algorithm on the text of Harry Potter and the Philosopher’s Stone and found that it could answer questions about that text with similar accuracy.
Beyond academic advances, Maluuba hopes to eventually create a system that can take care of mundane reading on your behalf. “We’re interested in use cases like user manuals, patient records, or customer service documents,” says Mohamed Musbah, vice president of product for the company, which is based in Waterloo, Canada. “In those areas, you really don’t have a glut of data.”
Maluuba’s team used a popular neural-network learning approach known as deep learning to train its system. But the researchers designed their network to consider text at different levels of abstraction, from words to phrases to sentences, and they also prepared the network to be good at learning in this way before training; usually, deep-learning networks are configured randomly before training. These choices allowed the network to learn very quickly, and resulted in question-answering performance 15 percent better than had been achieved before using a deep-learning approach. It was also 2 percent better than the best hand-coded solution.
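To make the idea of reading at several levels concrete, here is a toy sketch in Python. It is not Maluuba's network and involves no deep learning or training at all: it simply scores each multiple-choice answer by how much of the question-plus-answer text shows up in the story at the word level, the phrase level (approximated here by two-word sequences), and the sentence level, then picks the highest total. The function names, the overlap measure, and the fixed equal weights are all assumptions made for illustration.

```python
import re

def words(text):
    # Word level: lowercase tokens, punctuation dropped.
    return re.findall(r"[a-z']+", text.lower())

def phrases(text, n=2):
    # Phrase level: approximated by consecutive n-word sequences.
    w = words(text)
    return [tuple(w[i:i + n]) for i in range(len(w) - n + 1)]

def sentences(text):
    # Sentence level: naive split on end-of-sentence punctuation.
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]

def overlap(a, b):
    # Fraction of the distinct items in a that also appear in b.
    a, b = set(a), set(b)
    return len(a & b) / len(a) if a else 0.0

def score(story, question, answer, weights=(1.0, 1.0, 1.0)):
    # Hypothetical scoring rule: equal fixed weights, nothing is learned.
    hypothesis = question + " " + answer
    word_score = overlap(words(hypothesis), words(story))
    phrase_score = overlap(phrases(hypothesis), phrases(story))
    # Best match against any single sentence of the story.
    sentence_score = max(
        (overlap(words(hypothesis), words(s)) for s in sentences(story)),
        default=0.0,
    )
    w_word, w_phrase, w_sentence = weights
    return (w_word * word_score
            + w_phrase * phrase_score
            + w_sentence * sentence_score)

def answer_question(story, question, candidates):
    # Pick the candidate answer with the highest combined score.
    return max(candidates, key=lambda c: score(story, question, c))

story = "Tom has a red ball. Anna has a blue kite."
question = "What does Anna have"
candidates = ["a red ball", "a blue kite", "a green hat"]
print(answer_question(story, question, candidates))
```

In this tiny example, "a red ball" and "a blue kite" tie at the word and phrase levels, because both appear somewhere in the story. Only the sentence-level view, which checks whether the answer sits in the same sentence as "Anna," separates them. That is the intuition behind looking at text from more than one level, though a real system would learn how to weigh each level rather than fix the weights by hand.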
“On the face of the numbers, it’s a big jump,” says Yoshua Bengio, a professor at the University of Montreal and scientific advisor to Maluuba. But Bengio, who is one of a handful of deep-learning gurus now working with companies on commercial AI efforts, cautions that it will take a while for experts to parse the significance of the approach.
The idea of teaching machines to read and communicate effectively using language is certainly tantalizing. It could open up powerful new ways to interact with computers and mine information. But comprehending text is one of the biggest challenges in artificial intelligence; computers are usually tripped up by the fact that language requires a deep understanding of the way the real world works.
Despite the challenges, some of the biggest tech companies are trying to develop AIs that can comprehend text. Facebook is gathering conversational data through an assistant service called M in an effort to train its algorithms to converse naturally (see “Teaching Machines to Understand Us”). Google DeepMind, a subsidiary of Alphabet that is focused on AI research, is doing similar work, training deep-learning systems to read summaries of news articles (see “Google DeepMind Teaches Artificial Intelligence Machines to Read”).
As yet, however, there have been no grand breakthroughs, and it’s unclear how hard it may be to equip machines with sophisticated reading comprehension skills. Researchers are making progress largely by tweaking and improving key machine-learning techniques and feeding computers large quantities of annotated text.
The kind of machine-learning approach employed by the Maluuba researchers normally requires huge swaths of text in order to learn. Indeed, the amount of text required to make deep learning work has often been held up as one of its key limiting factors (see “Can This Man Make AI More Human?”). A fundamental challenge with language is that the words used to represent different concepts are arbitrary, so it is harder to draw connections between words than it is between images.
Maluuba, started by several graduates of the University of Waterloo in 2010, previously developed an intelligent personal assistant for smartphones, and it has focused its research on natural-language processing or machine comprehension.
“I think it's certainly a step forward,” says Richard Socher, cofounder of an AI company called MetaMind, which is also working on language processing. “It's a very well-engineered system that combines well-understood and established traditional natural-language processing features with ideas from neural networks.”
Chris Dyer, a researcher at Carnegie Mellon University who specializes in natural-language processing, agrees that Maluuba’s results are impressive, but believes that machines will need to gain a genuine understanding of the world in order to converse properly—as opposed to an ability to draw statistical conclusions from text. This will likely mean going beyond learning purely from annotated text.
“Computers are too limited in terms of their perception and understanding of the world,” Dyer says.