June 2003
Computers That Speak Your Language
Voice recognition that finally holds up its end of a conversation is revolutionizing customer service. Now the goal is to make natural language the way to find any type of information, anywhere.
By Wade Roush
"IwanttoflyfromBostontoMilwaukeenext Say that to a human airline agent nicely, and he or she will quickly disentangle your words and find flights that meet your criteria. Say it to the airline's automated reservations line, however, and all you're likely to get is a cheery digital voice intoning, "Sorry, I didn't catch that." Don't blame the voice. Even assuming the airline's computers overcame the garbled words, background noise, and Boston accent to render the request into accurate text, no language-processing system has the computational firepower to make sense of your price and routing constraints, ignore irrelevancies like the fact that Saturday is your sister's birthday, and understand that if the party starts at 3:00 p.m., you're not interested in flights that arrive in Milwaukee at 4:00. If computers could understand and respond to such routine natural-language requests, the results would be win-win: airlines wouldn't need to hire so many agents, and consumers wouldn't have to struggle with the confusion of touch-tone interfaces that leave them furiously tapping the "0" button, vainly trying to reach a live operator. Futurists have been envisioning such a world since at least 1968, when 2001: A Space Odyssey's HAL 9000 became the archetypal voice-interactive computer. Academic and corporate researchers intrigued by the sheer coolness of the idea have been tinkering for just as long with systems for recognizing and responding to human speech. But technologies don't take hold because they're cool: they need a business imperative. For language processing, it's the enormous expense of live customer service that's finally driving the technologies out of the lab. Simple "press or say one'" phone trees are rapidly heading for the scrap heap as companies such as Nuance Communications and SpeechWorks meld previously competing strategies into software that infers the intention behind people's naturally spoken or written requests. 
Major airlines, banks, and consumer-goods companies are already using the systems, and while the technology can't yet hold up its end of a conversation, it does help callers with simple questions avoid long queues, and it frees human agents to deal with more complex requests. Such improvements have set up natural-language systems for explosive growth: 43 percent of North American companies have either purchased interactive voice response software for their call centers or are conducting pilot studies, according to Forrester Research, a technology analysis firm. As more companies replace their old touch-tone phone menus, today's $500 million market for telephone-based speech applications will grow, reaching $3.5 billion by 2007, according to Steve McClure, a vice president in the software research group at market analysis firm IDC.
In late 2002, for example, Bell Canada installed a $4.5 million voice response system built by Menlo Park, CA-based Nuance. "Based on the results we're seeing, the actual return on investment will take only about 10 months," says Belinda Banks, Bell Canada's associate director of customer care. Overall, the company expects to save $5.3 million in customer service costs this year alone.
And this is only phase one in the deployment of language-processing systems. Companies like Nuance and Boston's SpeechWorks, the two market leaders in interactive voice response systems, are succeeding partly because they've tailored their technologies for narrow domains, such as travel information, where the vocabularies and concepts they must master are restricted. Even as such systems take over the customer service niche, other companies are still pursuing the challenge of true natural-language understanding.
If research efforts at IBM and the Palo Alto Research Center (PARC), for example, bear fruit, computers may soon be able to interpret almost any conversation, or to retrieve almost any information a Web user wants, even if it's locked away in a video file or a foreign language, opening markets wherever people seek knowledge via computer networks. Predicts IDC's McClure, "Whereas the GUI [graphical user interface] was the interface for the 1990s, the NUI, or 'natural' user interface, will be the interface for this decade."
Talk to the Phone