Views from the Marketplace are paid for by advertisers and select partners of MIT Technology Review.
Context, Language, and Reasoning in AI: Three Key Challenges
The next phase in the AI revolution calls for advances in how the technology addresses and processes data from the non-vision world.
Today, artificial intelligence (AI) is rapidly emerging out of R&D labs and into the mainstream. Smart technologies are changing every aspect of our lives, from the way we work, to health care, education, travel, and transportation. One example: the self-driving cars produced by Google and Tesla. There are also many successful applications in the computer vision space.
But what about the non-vision applications of AI, that is, areas involving non-spatial data, most importantly text and numbers? IBM’s Deep Blue famously beat a human chess grandmaster, and the IBM Watson technology platform beat “Jeopardy” champions and is featured with celebrities in TV ads heralding the arrival of a smarter planet. Google’s AlphaGo recently beat a Korean grandmaster in an even more complex challenge, the ancient game of Go.
Does all that mean AI is finally here for non-vision applications? We believe the answer is an emphatic “yes”—but not with the current approaches used by IBM and Google.
Because of AI’s revolutionary potential, its applications in non-vision problems have attracted tremendous interest. There have also been attempts to replicate what worked with spatial data and apply it to text (and numbers). I’m referring to what seems like a blind rush of computational, statistically based approaches to process natural language. Such approaches attempt to turn text into data and then look for deep patterns in that data.
That situation reminds me of when physicists entered the financial market space and attempted to create predictive models for financial data. Such efforts are bound to fail, as has already happened to several companies. Eventually, the hype and illusion of applicability will wear off. Then we’ll address the problem by focusing on the fundamental characteristics of the data and devising an approach that is more conceptually sound.
Addressing a Trio of Challenges
AI technologies must overcome three challenges to be successful in the non-vision world (and perhaps even in the vision world): language, context, and reasoning.
A recent MIT Technology Review article, “AI’s Language Problem,” eloquently points out the first challenge. Today’s AI technologies, including IBM Watson and Google AlphaGo, struggle to process language the way that humans do. That’s because the large majority of current implementations approach text as data, not as language. They apply to text the same techniques that worked on spatial data.
The second challenge—understanding context—is related to the language problem, but is sufficiently significant that I think of it as an independent issue. Natural language text needs to be processed in the right context. The right context can only be developed if the technology focuses on the language structure, not just on the words in the text, as most current technologies seem to be doing, according to a 2014 article in IEEE Computational Intelligence Magazine. Then there’s the third challenge: the traceability of reasoning that the solution deploys to reach its conclusion.
Various technologies are attempting to address all three challenges today. Several successful enterprise AI solutions deal with language, context, and reasoning transparency effectively.
Handling Natural Language: From Processing to Understanding
Current methods for natural language processing (NLP) are largely driven by computational statistics. These methods don’t attempt to understand the text, but instead convert the text into data, then attempt to learn from patterns in that data. In the conversion process, we lose all context and meaning in the text. The assumption behind such approaches is clearly that, given sufficiently large collections of text, all possible permutations and combinations of meaning must be present. Thus, discovering word-based patterns should reveal the intelligence in the text, which can then be acted upon. Unfortunately, that outcome doesn’t occur in most real-world situations.
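To make the loss of context concrete, here is a minimal sketch of a word-occurrence conversion. The function name and the two sample sentences are illustrative assumptions, not taken from any particular product; the point is only that two sentences with opposite meanings can become identical data once word order and structure are discarded.

```python
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count word occurrences, ignoring word order and sentence structure."""
    return Counter(text.lower().split())


# Hypothetical sentences: same words, opposite meanings.
a = "the bank acquired the startup"
b = "the startup acquired the bank"

# Once converted to counts, the two sentences are indistinguishable.
same_counts = bag_of_words(a) == bag_of_words(b)
print(same_counts)  # True
```

Who acquired whom is exactly the kind of meaning that lives in the linguistic structure rather than in the word counts.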
To address the language challenge in AI, we have to move away from mechanically converting natural language to data through, for example, word occurrence-based logic. Instead, we need to understand the language by using its linguistic structure and the principles we have learned to express our thoughts. I view this as moving from NLP to Natural Language Understanding (NLU). In my view, NLP has come to symbolize the mechanical approach to natural language through conversion of text into data. Our real goal in AI is devising mechanisms for understanding the meaning of the written text.
A deep understanding of the linguistic structure in text would involve applying several principles from computational linguistics to decompose the text back into its concepts and the verbiage used to connect them. This is essentially reverse-engineering the text back to its fundamental ideas to understand how those ideas were connected to form sentences and paragraphs.
RAGE AI has demonstrated such deep linguistic learning, and RAGE Frameworks has used this method to create and successfully deploy several AI applications in global corporations.
NLU also involves understanding the context in which the language is used. But understanding context involves multiple challenges.
First, in many languages, certain words can be used in multiple senses. That makes it important to eliminate the ambiguity of all such words so that their usage in a particular document can be accurately understood. Word-sense disambiguation is an ongoing issue in linguistics, but researchers have made significant progress toward addressing it.
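As a rough illustration of word-sense disambiguation, the sketch below uses a simplified form of the well-known Lesk approach: pick the sense whose dictionary gloss shares the most words with the surrounding sentence. This is a swapped-in textbook technique, not the method used by any system named in this article, and the tiny sense inventory, glosses, and stopword list are invented for the example.

```python
# Hypothetical two-sense inventory for a single ambiguous word.
SENSES = {
    "bank": {
        "financial_institution": "an institution that accepts deposits and makes loans of money",
        "river_edge": "the sloping land beside a river or stream of water",
    }
}

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "and", "or", "that", "to", "on", "in"}


def content_words(text: str) -> set:
    """Lowercase, strip basic punctuation, and drop stopwords."""
    return {w.strip(".,").lower() for w in text.split()} - STOPWORDS


def disambiguate(word: str, sentence: str) -> str:
    """Return the sense whose gloss overlaps most with the sentence."""
    context = content_words(sentence)
    overlaps = {
        sense: len(context & content_words(gloss))
        for sense, gloss in SENSES[word].items()
    }
    return max(overlaps, key=overlaps.get)


print(disambiguate("bank", "She went to the bank to arrange loans and deposits"))
print(disambiguate("bank", "They sat on the bank beside the river"))
```

The first sentence shares "loans" and "deposits" with the financial gloss, and the second shares "beside" and "river" with the river gloss, so each resolves to a different sense of the same word.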
Second, text documents often use domain-specific discourse models such as legal contracts, news articles, research reports, and the like. Certain properties of such domain discourse models should be incorporated into the AI technology to enhance NLU.
Third, we use many words as proxies in the document for other concepts. For example, most commonly, we say “Xerox” for “copy,” “FedEx” for “overnight courier,” and so on. AI technology must be able to recognize and understand these proxies.
Finally, the document may refer to knowledge that isn’t explicitly included in the text. We can understand it only if we have that prior knowledge.
AI has to create a repository of such global knowledge that can be retrieved, in context, to supplement the text in the document to gain full understanding of the meaning of the text. The Automated Knowledge Discoverer in RAGE AI is one such example of this idea, as I explain in more depth in my recent book, The Intelligent Enterprise in the Era of Big Data (John Wiley & Sons, 2016). This technology can automatically discover ideas related to a notion and expressions with various rhetorical relationships to the concept of interest.
For a period of time, such knowledge and global context may have to be refined by human experts. But in a short period, we have found it possible to create enough knowledge in the machine for it to perform at better than 90 percent recall. For instance, we created an AI application to categorize relevant content for a global consulting company across 20 of its practice areas. The idea was to provide, on a real-time basis, distilled knowledge to all of its consultants, using information gleaned from each practice area. Automated knowledge discovery was used to expand this to a more global understanding. Now, this application categorizes 40 million articles a month with greater than 90 percent accuracy through deep linguistic learning.
The final challenge we need to recognize is visibility into the reasoning deployed by AI technology. Almost all AI technologies using computational statistics are black boxes. There’s nothing wrong with that per se, except that when we get a recommendation from the AI technology and it isn’t intuitive, we have no way of understanding it. We also don’t know whether it is truly causal or spurious. We just have to blindly trust it.
Of course, there are applications where such visibility may not matter. For instance, in the example involving the game Go, it wasn’t important to understand the reasoning deployed by the machine for its moves. Another example: While we’d all prefer for Internet searches to be more relevant, the false positives don’t bother us too much.
On the other hand, we believe that for many applications, such visibility will be essential for adoption. In certain mission-critical applications where people are held accountable—such as medicine and business—users have to develop trust that the engine’s reasoning is sound. Visibility would also make it easier to improve the engine in the event of false positives or false negatives. With a black box, we have to find enough instances of false positives or false negatives to rebuild the black box. We’ll have no way of knowing whether all variations or permutations of that error have been addressed.
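The difference between a black box and a traceable engine can be sketched in a few lines. In the hypothetical example below, every conclusion is returned together with the rules that produced it, so a false positive can be traced to a specific rule and corrected directly. The rule names, keywords, and labels are invented for illustration, and the simple keyword rules are only a stand-in to show the traceability property; they are not an implementation of deep linguistic learning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str             # hypothetical rule identifier
    keywords: frozenset   # words that trigger the rule
    label: str            # category the rule votes for


# Illustrative rules only; a real engine would reason over linguistic structure.
RULES = [
    Rule("mentions_merger", frozenset({"merger", "acquisition"}), "corporate_finance"),
    Rule("mentions_lawsuit", frozenset({"lawsuit", "litigation"}), "legal"),
]


def classify_with_trace(text: str):
    """Return (label, trace), where trace lists every rule that fired and why."""
    words = {w.strip(".,").lower() for w in text.split()}
    trace = []
    for rule in RULES:
        matched = sorted(words & rule.keywords)
        if matched:
            trace.append((rule.name, rule.label, matched))
    # First rule that fired decides the label; None if nothing fired.
    label = trace[0][1] if trace else None
    return label, trace


label, trace = classify_with_trace("The merger was announced despite pending litigation.")
print(label)
for rule_name, rule_label, matched in trace:
    print(rule_name, rule_label, matched)
```

Because the trace names each rule and the words that triggered it, an error is fixed by editing one rule, rather than by collecting enough failing instances to retrain an opaque model.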
The good news: With the adoption of deep linguistic learning, we can maintain full visibility into the reasoning.
Venkat Srinivasan is the founding CEO of RAGE Frameworks and a successful serial entrepreneur. He is also a former associate professor in the College of Business Administration at Northeastern University in Boston. He has published more than 30 articles in prestigious peer-reviewed journals and contributed to news publications such as The Wall Street Journal. He holds five patents in the area of knowledge-based automation and linguistics. He is the author of The Intelligent Enterprise in the Era of Big Data (John Wiley & Sons, 2016).