As part of a research collaboration with MIT computer scientists, the Nokia Research Center Cambridge, in Cambridge, Massachusetts, is developing cell phones that can understand and respond to written commands typed in English.
Robert Iannucci, head of Nokia’s research centers, says the company wants to transform phones from simple calling terminals into “information gateways” – to the Internet, GPS devices and sensors, MP3 players, desktop computers, iPods, and other devices. And, he says, that requires rethinking the entire interface between people and handhelds. For both Nokia and MIT, that means using text interaction.
“Humans are good with language,” says Boris Katz, lead research scientist at MIT’s Computer Science and Artificial Intelligence Laboratory, the principal group working with Nokia. “We want language to be a first-rate citizen” on cell phones, he says.
Natural language navigation systems have been long on promise and short on delivery. But it’s no longer unrealistic to think these systems may be in the hands – and handsets – of consumers in the near future. One caveat: the complex underpinnings of these new applications and the algorithms that parse language will have to be hidden from cell-phone users – lest they get frustrated navigating through layers of menus.
To power Nokia’s natural language technology, MIT’s Katz is using a software system he developed in 1993 called Start, which interprets human questions and finds answers using websites such as the Internet Movie Database (IMDb) and MapQuest. Using the Web version of Start as a base, Katz is currently working with the Nokia center to develop a mobile version of the software for cell phones, called MobileStart.
Here’s how the Web version of Start works: users type a question into a text field. The software interprets the query, decides where to seek the answer (in its database or on another website), and responds with a written explanation, a link to a website, or an image.
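That routing step can be sketched in a few lines of code. The database contents, function name, and fallback target below are illustrative assumptions, not Start’s internals:

```python
# A toy dispatcher in the spirit of Start's routing step.
# LOCAL_DB and the fallback site are illustrative assumptions.
LOCAL_DB = {"capital of france": "Paris"}

def route(question):
    """Decide where a question should be answered: try the local
    database first, then fall back to an external website."""
    key = question.rstrip("?").lower()
    if key.startswith("what is the "):
        key = key[len("what is the "):]
    if key in LOCAL_DB:
        return ("answer", LOCAL_DB[key])      # respond with a written answer
    return ("link", "https://www.imdb.com")   # hand off to another site

print(route("What is the capital of France?"))   # ('answer', 'Paris')
```

A question the local database recognizes comes back as text; anything else is handed off as a link, mirroring the text-or-link-or-image responses described above.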
“Start extracts answers, not hits,” says Katz, because it interprets human language rather than matching keywords, as Google and other search engines do.
The Start system understands English sentences by breaking them down into relationships among an object, a property, and a value. For instance, if one types, “What is the population of Iraq?”, Start interprets the query: the object is Iraq, the property is population, and the value is what Start seeks.
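A rough sense of that analysis can be given in code. The pattern and field names here are a simplification for illustration, not Start’s actual grammar:

```python
import re

def parse_question(question):
    """Reduce a 'What is the X of Y?' question to an
    object-property-value triple (a toy pattern, not Start's parser)."""
    match = re.match(r"What is the (.+) of (.+)\?", question)
    if match is None:
        return None
    prop, obj = match.groups()
    # The value is unknown -- it is what the system must look up
    return {"object": obj, "property": prop, "value": None}

print(parse_question("What is the population of Iraq?"))
# {'object': 'Iraq', 'property': 'population', 'value': None}
```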
That’s straightforward; however, people tend to ask more complex questions, particularly if they’re looking for specific information. If a person asks a question such as, “How many people live in the capital of the third-largest country in Asia?” Start will break it down into three separate queries to process one at a time: What is the third-largest country in Asia, what is that country’s capital, and what is the population of the capital? (Start decides how to break up questions and how to prioritize its evaluations using an algorithm Katz designed.)
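The decomposition described above can be sketched as a simple recursion on “of”-phrases. This is a toy illustration, not the prioritization algorithm Katz designed:

```python
def decompose(question):
    """Break a nested question into sub-queries, innermost first
    (a toy sketch; Katz's actual algorithm is more sophisticated)."""
    q = question.rstrip("?")
    # Treat "How many people live in X" as a population lookup on X
    prefix = "How many people live in "
    if q.startswith(prefix):
        q = "the population of " + q[len(prefix):]
    elif q.startswith("What is "):
        q = q[len("What is "):]
    # Each " of " marks one level of nesting; resolve right to left
    parts = [p.strip() for p in q.split(" of ")]
    steps = [f"What is {parts[-1]}?"]
    for i, prop in enumerate(reversed(parts[:-1]), start=1):
        steps.append(f"What is {prop} of <answer {i}>?")
    return steps

for step in decompose(
    "How many people live in the capital of the third-largest country in Asia?"
):
    print(step)
# What is the third-largest country in Asia?
# What is the capital of <answer 1>?
# What is the population of <answer 2>?
```

Each sub-query’s answer is substituted into the next, so the innermost question is resolved first, just as in the example in the text.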