
Talk to the Phone

Speech-recognition software from Vlingo could make the mobile Web easier to use.
August 21, 2007

Mobile phones can do lots of things: search the Web, download music, send e-mail. But the vast majority of the 233 million Americans who own them never use them for more than calls and short text messages. One reason is that other features often require users to enter sentences or long search terms, a tedious task.

Voice-recognition correction: Using Vlingo’s voice-recognition interface, a user can speak a sentence into a cell phone, see that sentence (in this case the text message “Hey Andy how’s it going”) appear on a screen, and use simple editing tools to replace words that may have been misinterpreted by voice-recognition technology. The interface helps users avoid manual entry of text messages, search terms, and e-mails.

Speech-recognition interfaces could make such features easier to use. Vlingo, a startup in Cambridge, MA, is coming to market with a simple user interface that provides speech recognition across mobile-phone applications. “We are not developing the core speech-recognition engine,” says cofounder Michael Phillips, a former MIT research scientist and founder of SpeechWorks, which developed call-center speech interfaces for clients including Amtrak. “We don’t need to do that again.” Instead, Vlingo takes speech, turns it into text, and provides a simple way to correct errors using the phone’s navigation keys, helping the system “learn.” The user’s spoken words travel over a mobile Internet connection for analysis on Vlingo’s server, sparing the phone the heavy computational work; the transcription appears less than two seconds later.

As a test, I asked the phone for “Schumann Piano Concerto.” Vlingo came back quickly with “Sean Piano Concerto.” When I hovered the cursor over the word “Sean,” the system offered alternatives like “shine” and “sign.” If one of them had been right, I could have clicked to insert it as a replacement. But since the right word didn’t appear, I typed it in manually.

My correction upped the chances of better results in the future. I had taught the system that the next time I use a word that sounds like Schumann, “Schumann” should be one of my optional transcriptions. I also taught it that other people conducting music searches might use the word “Schumann”–so it might start popping up for them, too.
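
The correction loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Vlingo's implementation: the class name CorrectionStore, its methods, and the ranking order (a user's own corrections first, then other users' corrections, then the recognizer's own guesses) are all hypothetical.

```python
from collections import Counter, defaultdict


class CorrectionStore:
    """Hypothetical store of corrections, keyed by the word the recognizer produced."""

    def __init__(self):
        # user -> heard word -> Counter of words the user typed instead
        self.per_user = defaultdict(lambda: defaultdict(Counter))
        # heard word -> Counter of corrections pooled across all users
        self.shared = defaultdict(Counter)

    def record(self, user, heard, corrected):
        """Remember that `user` replaced the transcribed word `heard` with `corrected`."""
        key = heard.lower()
        self.per_user[user][key][corrected] += 1
        self.shared[key][corrected] += 1

    def alternatives(self, user, heard, recognizer_guesses, limit=3):
        """Return up to `limit` replacement candidates for a transcribed word."""
        key = heard.lower()
        sources = (
            [w for w, _ in self.per_user[user][key].most_common()],  # this user's fixes
            [w for w, _ in self.shared[key].most_common()],          # everyone's fixes
            recognizer_guesses,                                      # engine's own guesses
        )
        ranked = []
        for source in sources:
            for word in source:
                if word not in ranked:
                    ranked.append(word)
        return ranked[:limit]


store = CorrectionStore()
before = store.alternatives("reviewer", "Sean", ["shine", "sign"])
store.record("reviewer", "Sean", "Schumann")
after = store.alternatives("reviewer", "Sean", ["shine", "sign"])
for_others = store.alternatives("someone_else", "Sean", ["shine", "sign"])
```

In this sketch, a single typed correction is enough to put “Schumann” at the top of the list for the user who made it and to make it available to other users as well. A real system would presumably weigh acoustic evidence and search context too, which the example leaves out.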

Company: Vlingo, Cambridge, MA
Funding: $6.5 million from Charles River Ventures and Sigma Partners
Founders: Michael Phillips, SpeechWorks founder and former MIT research scientist; John Nguyen, former SpeechWorks computer scientist
CEO: David Grannan, former manager of mobile e-mail at Nokia

“Small platforms need speech, and search is a powerful way to find information,” says James Glass, head of the spoken-language systems group at MIT’s Computer Science and Artificial Intelligence Laboratory. “The combination of the two is very powerful,” he says, adding that Vlingo is working at that frontier.

Vlingo wants mobile-phone carriers to bundle its interface with other offerings. “Carriers may be happy to give it away, because they will generate revenue as people actually use navigation systems or surf the Web,” says CEO David Grannan.

Mazin Gilbert, executive director of natural-language processing at AT&T Labs in Florham Park, NJ, says others, including AT&T, are also developing speech interfaces for mobile phones; he thinks one problem will be “providing the right user experience in a cost-effective, scalable way.” Vlingo thinks a simple, adaptable interface is one way to make growth easy.
