Sending Clippy Back to School
The deployment of A.I. in applications like robotic search and rescue is at an early stage, but that’s not true on other fronts. One of the busiest artificial-intelligence labs today is at Microsoft Research, where almost everything is aimed at conjuring up real-world applications.
Here several teams under the direction of Eric Horvitz, senior researcher and manager of the Adaptive Systems and Interaction group, are working to improve the embedded functions of Microsoft products. Several of the group’s A.I.-related advances have found their way into Windows XP, the latest iteration of Microsoft’s flagship operating system, including a natural-language search assistant called Web Companion and “smart tags,” a feature that automatically turns certain words and phrases into clickable Web links and entices readers to explore related sites.
To demonstrate where things are heading, Horvitz fires up the latest in “smart” office platforms. It’s a system that analyzes a user’s e-mails, phone calls (wireless and land line), Web pages, news clips and stock quotes (all the free-floating informational bric-a-brac of a busy personal and professional lifestyle) and assigns every piece a priority based on the user’s preferences and observed behavior. As Horvitz describes the system, it can perform a linguistic analysis of a message text, judge the sender-recipient relationship by examining an organizational chart and recall the urgency of the recipient’s responses to previous messages from the same sender. To this it might add information gathered by watching the user by video camera or scrutinizing his or her calendar. At the system’s heart is a Bayesian statistical model, capable of evaluating hundreds of user-related factors linked by probabilities, causes and effects in a vast web of contingent outcomes, that infers the likelihood that a given decision on the software’s part will lead to the user’s desired outcome. The ultimate goal is to judge when the user can safely be interrupted, with what kind of message, and via which device.
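The kind of reasoning described above can be illustrated with a toy calculation. The sketch below is not Microsoft’s model. It assumes a naive-Bayes simplification in which each piece of evidence independently shifts the odds that a message is urgent, and then compares the expected cost of delaying the message with the cost of interrupting the user. The feature names, likelihood ratios, prior and cost figures are all invented for illustration.

```python
import math

# Hypothetical likelihood ratios: P(feature | urgent) / P(feature | not urgent).
# These values are made up for illustration; they do not come from the article.
LIKELIHOOD_RATIOS = {
    "sender_is_manager": 4.0,
    "mentions_deadline": 3.0,
    "user_replied_fast_before": 2.5,
    "bulk_mailing": 0.1,
}


def urgency_probability(features, prior=0.2):
    """Estimate P(urgent) by adding log likelihood ratios to the prior log-odds.

    Assumes the features are conditionally independent (naive Bayes),
    which is far simpler than the web of dependencies the article describes.
    """
    log_odds = math.log(prior / (1.0 - prior))
    for name in features:
        log_odds += math.log(LIKELIHOOD_RATIOS[name])
    return 1.0 / (1.0 + math.exp(-log_odds))


def should_interrupt(p_urgent, cost_of_delay, cost_of_interruption):
    """Interrupt only if the expected cost of waiting exceeds the cost of disturbing the user."""
    return p_urgent * cost_of_delay > cost_of_interruption


if __name__ == "__main__":
    p = urgency_probability(["sender_is_manager", "mentions_deadline"])
    print(round(p, 3), should_interrupt(p, cost_of_delay=10.0, cost_of_interruption=5.0))
```

In this toy version, a higher interruption cost (say, because the calendar shows the user in a meeting) raises the bar a message must clear before it is delivered, which is the trade-off the article attributes to the real system.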
Horvitz expects that such capabilities, to be built into coming generations of Microsoft’s Office software, will help workers become more efficient by freeing them from low-priority distractions such as spam e-mail and by scheduling meetings automatically, without the usual rounds of phone tag. That will be a big step forward from Clippy, the animated paper clip that first appeared as an office assistant in Microsoft Office 97, marking Microsoft’s first commercial deployment of Bayesian models. Horvitz says his group learned from Clippy and other intelligent assistants, which were derided as annoyingly intrusive, that A.I.-powered assistants need to be much more sensitive to the user’s context, environment and goals. The less they know about a user, Horvitz notes, the fewer assumptions they should make about his or her needs, to avoid distracting the person with unnecessary or even misguided suggestions.
Another ambitious project in Horvitz’s group aims to achieve better speech recognition. His team is building DeepListener, a program that assembles clues from auditory and visual sensors to clarify the ambiguities in human speech that trip up conventional programs. For instance, by noticing whether a user’s eyes are focused on the computer screen or elsewhere, DeepListener can decide whether a spoken phrase is directed at the system or simply part of a human conversation (see “Natural Language Processing,” TR January/February 2001).
In a controlled environment, the system turns in an impressive performance. When ambient noise makes recognition harder, however, DeepListener tends to freeze up or make wild guesses. Horvitz’s group is developing algorithms that will enable the software to behave more as hard-of-hearing humans do: for example, by asking users to repeat or clarify phrases, by providing a set of possible meanings and asking for a choice, or by sampling homonyms in search of a close fit. But this work veers toward the very limits of A.I. “Twenty-five years from now,” Horvitz acknowledges, “speech recognition will still be a problem.”
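The hard-of-hearing strategy can be pictured as a simple policy over a recognizer’s confidence scores. The sketch below is an illustration only, not DeepListener’s algorithm. It assumes a hypothetical recognizer that returns candidate phrases with confidence values between 0 and 1, and the two thresholds are arbitrary.

```python
def choose_action(hypotheses, accept_at=0.8, choose_at=0.3):
    """Pick a dialog move from a list of (phrase, confidence) pairs.

    The hypotheses are assumed to come from some speech recognizer;
    the threshold values are illustrative, not tuned.
    """
    if not hypotheses:
        return ("repeat", [])

    # Rank candidates from most to least confident.
    ranked = sorted(hypotheses, key=lambda h: h[1], reverse=True)
    best_phrase, best_conf = ranked[0]

    # Confident enough: act on the top candidate without bothering the user.
    if best_conf >= accept_at:
        return ("accept", [best_phrase])

    # Otherwise keep only candidates that are at least plausible.
    plausible = [phrase for phrase, conf in ranked if conf >= choose_at]
    if len(plausible) >= 2:
        return ("offer_choices", plausible)  # "Did you mean A or B?"
    if len(plausible) == 1:
        return ("confirm", plausible)        # "Did you say A?"
    return ("repeat", [])                    # "Could you say that again?"


if __name__ == "__main__":
    print(choose_action([("write", 0.45), ("right", 0.40)]))
```

Under these assumptions, two near-tied homonyms lead the program to offer a choice rather than guess, while a single weak candidate leads it to ask the user to repeat the phrase.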