A View from Paul Boutin

The Search for a Clearer Voice

How Google’s Voice Search is getting so good.

  • January 10, 2011

Smart phones are great at a lot of things, with one exception: Typing on a touch screen or a downsized keyboard is still frustrating compared to a full-size computer keyboard. That’s probably why Google says that, even before the release of its new personalized Voice Search app for Android in mid-December, one in four mobile searches were already input by voice rather than from a keyboard.

Credit: Google

The improved Voice Search takes speech recognition to its next level: Google’s servers will now log up to two years of your voice commands in order to more precisely parse exactly what you’re saying.

In tests on the new app, which appeared in Google’s Android Market a week before Christmas, it initially got about three out of five searches correct. After a few days, the ratio crept up to four out of five. It’s surprisingly good at searches that involve common nouns (“heathen child lyrics”) and what search experts call vertical searches for popular topics like airline flights and movie listings. Voice Search knows “United Flight 714” and “True Grit show times 90066” when it hears them. Less successful are searches involving people’s names. In repeated attempts to Google up WikiLeaks founder Julian Assange, Voice Search got no closer than “wikileaks founder julian of songs.”

How does it work? Rather than try to use the phone itself to do speech recognition, Voice Search digitizes the user’s input commands and sends them off to Google’s gargantuan server farms. There, the spoken words are broken down and compared against both statistical models of what words other people mean when they utter those syllables and a history of the user’s own voice commands, through which Google refines its matching algorithm for that particular voice. The app recognizes five different flavors of English (American, British, Australian, Indian and South African) plus Afrikaans, Cantonese, Czech, Dutch, French, German, Italian, Japanese, Korean, Mandarin, Polish, Portuguese, Russian, Spanish, Turkish, and Zulu.

The tricky part—and the motive for a personalized search app—is that human voices vary wildly between men and women, between young people and old people, and among those with various accents and dialects. By storing hundreds, perhaps thousands of what speech recognition experts call “utterances” by the same person over months of use, Voice Search can better guess at what that particular person is saying.
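To make the idea concrete, here is a minimal sketch of how a recognizer might blend a population-wide score with one user's history when choosing among candidate transcripts. It is an illustration only, not Google's algorithm: the function name `rescore`, the `weight` parameter, the word-count history, and the sample scores are all assumptions invented for this example.

```python
from collections import Counter


def rescore(candidates, global_scores, user_history, weight=0.3):
    """Return the candidate transcript with the best blended score.

    candidates:    list of candidate transcripts for one utterance
    global_scores: dict mapping each transcript to a score from a
                   population-wide model (hypothetical values)
    user_history:  Counter of words from this user's earlier queries
    weight:        how much the personal history counts (an assumption)
    """
    total = sum(user_history.values())

    def personal(text):
        # Average relative frequency of the transcript's words
        # in this user's own history; 0.0 if there is no history.
        words = text.split()
        if not words or total == 0:
            return 0.0
        return sum(user_history[w] for w in words) / (total * len(words))

    def blended(text):
        return (1 - weight) * global_scores[text] + weight * personal(text)

    return max(candidates, key=blended)


candidates = ["julian of songs", "julian assange"]
global_scores = {"julian of songs": 0.55, "julian assange": 0.45}

# With no history, the population-wide model decides.
no_history_choice = rescore(candidates, global_scores, Counter())

# After the user has searched for related terms, the balance tips.
history = Counter({"assange": 8, "wikileaks": 2})
personalized_choice = rescore(candidates, global_scores, history)
```

In this toy setup the generic model prefers the common words "of songs," while a user who has searched for Assange before gets the name. A real system would work on acoustic features and far richer language models, but the blending principle is the same.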

The mathematical model used to recognize phrases was refined over three years using voice samples from Google’s now-defunct GOOG-411 automated directory assistance service, which the company operated from 2007 through late last year specifically to capture a wide-ranging set of voice samples for analysis. The company’s first Voice Search app, for iPhone only, was launched in November 2008, a year after GOOG-411.

Voice Search doubles as a spoken-command system for the phone. As shown in this video, it understands commands such as, “Send mail to Mike LeBeau. How’s life in New York treating you? The weather’s beautiful here.” The app will find LeBeau in your contacts—it’s better at matching names here than in a Web search, because it’s working with a limited set—and will fill in the subject line with your first sentence. You can speak additional text into the message, or edit it with the phone’s keyboard, before sending it.
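Why a limited set helps can be shown with a small sketch: when the recognizer only has to pick the closest entry from a short contact list, even a slightly misheard name resolves correctly. This uses the fuzzy string matching in Python's standard `difflib` module as a stand-in; the function `match_contact`, the cutoff value, and the sample contact list are hypothetical and say nothing about how the app itself matches names.

```python
import difflib


def match_contact(heard_name, contacts, cutoff=0.6):
    """Return the contact closest to the heard name, or None.

    heard_name: the recognizer's best guess at the spoken name
    contacts:   the user's contact names (the limited set)
    cutoff:     minimum similarity to accept, from 0 to 1 (an assumption)
    """
    # Compare case-insensitively, but return the original spelling.
    lowered = {name.lower(): name for name in contacts}
    hits = difflib.get_close_matches(
        heard_name.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[hits[0]] if hits else None


contacts = ["Mike LeBeau", "Paul Boutin", "Julian Assange"]

# A misheard spelling still lands on the right contact.
found = match_contact("mike lebow", contacts)

# Something unlike any contact is rejected rather than guessed.
missing = match_contact("zzz", contacts)
```

An open Web search has no such short list to fall back on, which is one plausible reason names fare worse there.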

Google has clearly put a lot of effort into its speech recognition technology. But the impact on its bottom line is obvious: By removing the aggravation of typing on tiny keys, the company hopes to get customers to reach for its search and e-mail services much more often.
