Alibaba has claimed a new record in AI language understanding

Will Knight

July 9, 2019


An AI program developed by Alibaba has notched up a record-high score on a reading comprehension test. The result shows how machines are steadily improving at handling text and speech.

Getting better: The new record was set using the Microsoft Machine Reading Comprehension (MS MARCO) data set, which uses real questions that Bing users have asked in the past. The AI program had to read many web pages to answer questions such as “What is a corporation?” (In this case the answer would be: “A corporation is a company or group of people authorized to act as a single entity and recognized as such in law.”) According to two measures, its scores were close to or slightly better than humans’.

Bigger, better: AI programs have been improving at these sorts of question-and-answer tasks thanks to large, flexible learning algorithms and copious amounts of data. The Alibaba team developed a technique that essentially prunes out irrelevant text before trying to answer a question.
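The article does not say how Alibaba's pruning step works, so the following is only a rough sketch of the general idea, not the team's method. It assumes a very simple stand-in technique: score each passage by how many content words it shares with the question, then keep the best few. The function names, the stop-word list, and the sample passages are all hypothetical.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "the", "is", "what", "of", "to", "in", "and", "or"}


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def prune_passages(question, passages, keep=2):
    """Keep up to `keep` passages that share the most content words with the question.

    This is plain word overlap, a stand-in for whatever relevance model
    a real system would use. Passages with no overlap are always dropped.
    """
    q_words = tokenize(question) - STOP_WORDS
    scored = [(len(q_words & tokenize(p)), i, p) for i, p in enumerate(passages)]
    # Highest overlap first; ties keep their original order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [p for score, _, p in scored[:keep] if score > 0]


# Made-up passages for illustration only.
passages = [
    "A corporation is a company or group of people authorized to act as a single entity.",
    "The weather in Seattle is often rainy in the winter.",
    "Corporation tax is paid on company profits.",
]

kept = prune_passages("What is a corporation?", passages)
for passage in kept:
    print(passage)
```

In this toy run the passage about the weather is discarded, and only the two passages that mention a corporation would be handed to the answering step. A production system would almost certainly use a learned relevance model rather than word overlap.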

AI everywhere: Better language understanding helps Alibaba improve the chatbots that offer support to small retailers, says Lou Si, a VP at Alibaba’s DAMO academy, who led the team that developed the new algorithm. It can also make web search more natural. He adds that it will be a key part of the company’s cloud offerings and could even help break down language barriers between different businesses.

Better than us, though? The new program is not, however, “better at reading comprehension than humans.” It was simply able to answer some questions about a subset of text better than people, on average. It is still essentially doing statistical pattern recognition without comprehending the meaning of the words it sees.

“There is still a long journey ahead of us to having machines use language as freely as humans do,” says Si. “Most of the time machines will answer questions based on facts retrieved from the documents, but they lack reasoning skills ... That’s different from how humans use language.”

