Phones Pick Up Language

Faster chips and better software help mobile devices recognize speech.

Cell phones and wireless PDAs have one perennial problem: either no keyboard or a very small one. That makes typing anything more than a phone number a tedious, fumbling task. But a solution is on the way: mobile devices that are adept at recognizing spoken language.

Some cell phones already use speech recognition as an alternative to keypad entry for simple tasks such as dialing a number, but someday soon you may also find yourself dictating a text message into your phone, asking your car for directions, or telling your MP3 player that you want to listen to the Beatles. Indeed, today’s high-end cell phones are capable of running sophisticated speech recognition software that could eventually mean the end of pecking at keyboards. “The fundamental problem of inputting information into mobile devices is the interface, and voice overcomes that,” says Rich Geruson, CEO of VoiceSignal, a speech technology company based in Woburn, MA.

While companies like IBM and Dragon Systems (now part of Peabody, MA-based ScanSoft) have been selling desktop speech recognition software for more than a decade, mobile devices with even limited speech recognition abilities appeared only several years ago. And until now, such devices have largely been “speaker dependent” – meaning they work well only for their principal users and have to be trained to recognize individual words.

This story is part of our October 2004 Issue

Faster processors and more efficient software, however, are enabling new speaker-independent systems that can recognize the speech of any user and require no training. These systems can discern thousands, rather than dozens, of names and are designed to work even when the speaker is in a noisy environment, such as the front seat of a speeding car.

For the engineers at VoiceSignal, the key to this advance was a shift in focus from accuracy to efficiency. The highly accurate speech recognition algorithms designed for desktop computers are too complex to run on mobile devices. Traditional algorithms for mobile devices required less processing power, but because they worked by matching the sound wave of an entire word to a sound wave stored in a device’s memory, they were limited to a small vocabulary.

Instead of storing an entire sound wave for each word in its lexicon, VoiceSignal’s new system stores information about phonemes – the smallest units of recognizable speech. Every phoneme can be described according to a set of acoustic parameters, such as pitch. The software measures a user’s utterances along these parameters and then looks for words that match. Parameter values take up less memory than audio files, so the software can handle a larger vocabulary without requiring any additional storage space.
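To make the idea concrete, here is a minimal sketch of matching an utterance against a phoneme-based lexicon. It is an illustration under stated assumptions, not VoiceSignal's actual method: the parameter values, the phoneme labels, and the names `PHONEMES`, `LEXICON`, `word_distance`, and `recognize` are all hypothetical, and the utterance is assumed to arrive already split into one measurement per phoneme, which a real recognizer would have to work out for itself.

```python
import math

# Hypothetical acoustic parameters per phoneme: (pitch-like value, energy-like value).
# The numbers are invented for illustration only.
PHONEMES = {
    "k": (0.20, 0.90),
    "ae": (0.70, 0.60),
    "t": (0.10, 0.80),
    "b": (0.30, 0.40),
    "d": (0.25, 0.50),
    "aa": (0.80, 0.30),
    "g": (0.35, 0.70),
}

# Each word is stored as a short phoneme sequence rather than a full recording.
LEXICON = {
    "cat": ["k", "ae", "t"],
    "bat": ["b", "ae", "t"],
    "dog": ["d", "aa", "g"],
}


def word_distance(segments, phonemes):
    """Sum the distances between measured segments and a word's phonemes."""
    if len(segments) != len(phonemes):
        return math.inf  # assumption: segment count must equal phoneme count
    return sum(math.dist(seg, PHONEMES[p]) for seg, p in zip(segments, phonemes))


def recognize(segments):
    """Return the lexicon word whose phoneme parameters best fit the utterance."""
    return min(LEXICON, key=lambda word: word_distance(segments, LEXICON[word]))


# Three measured segments that sit close to the parameters for "k", "ae", "t".
utterance = [(0.22, 0.85), (0.68, 0.62), (0.12, 0.79)]
print(recognize(utterance))
```

In this sketch, adding a word to the vocabulary costs only one short list of phoneme labels, because the acoustic parameters are shared by every word that uses the same phonemes.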

And that’s opening up applications beyond simple voice dialing. For example, VoiceSignal offers software that lets users jump to any node of a cell phone’s menu with a single utterance. “If you try to send a [text] message on your phone right now, you have to do about ten clicks just to get to the message space,” says Geruson. “With our technology, you just say, ‘Send message to John Smith’s mobile,’ and your cursor is flashing and ready to go.” The first phone with this capability was released in August. Within six months the company also plans to release software for phones that lets users dictate text messages and e-mails, which Geruson anticipates will be particularly useful in Asia. “If you think it’s hard to input into a keyboard in Western or Latin languages, think about the problem in Japan or China, where you have thousands of characters,” he says.

Researchers at ScanSoft, meanwhile, are putting speech recognition to use in cars. A car kit with a built-in microphone, speakerphone, and ScanSoft speech engine provides motorists with a hands-free interface for their cell phones. A phone equipped with a Bluetooth wireless transmitter can be placed anywhere in a car, and drivers can use voice commands to dial, accept or reject calls, adjust the volume, and control menu options, all without taking their hands off the wheel or their eyes off the road.

While the wireless industry has been the first to embrace speech recognition, makers of consumer electronics appear to be close behind. At the Mitsubishi Electric Research Laboratories in Cambridge, MA, researchers are developing software that enlists speech to simplify the task of searching for information. Rather than scrolling through 10,000 MP3 songs on a handheld device, for instance, a user could select a single song just by saying its name – or that of a band or album. “We decided one of the things [speech] was good at was choosing,” says Mitsubishi speech technology researcher Peter Wolf.

Despite these advances, however, it remains to be seen how widely speech recognition will be adopted. Phone users may feel uncomfortable dictating personal e-mails in public. And they may always want keyboards for entering sensitive information such as credit card numbers. But Geruson predicts that the technology will eventually transform the way people use mobile devices. As a few early adopters take to the technology, he says, “it will catch on, and then it will be everywhere.”
