A View from Paul Boutin

The Search for a Clearer Voice

How Google’s Voice Search is getting so good.

  • January 10, 2011

Smart phones are great at a lot of things, with one exception: Typing on a touch screen or a downsized keyboard is still frustrating compared to a full-size computer keyboard. That’s probably why Google says that, even before the release of its new personalized Voice Search app for Android in mid-December, one in four mobile searches were already input by voice rather than from a keyboard.

Credit: Google

The improved Voice Search takes speech recognition to its next level: Google’s servers will now log up to two years of your voice commands in order to more precisely parse exactly what you’re saying.

In tests of the new app, which appeared in Google’s Android Market a week before Christmas, it initially got about three out of five searches correct. After a few days, the ratio crept up to four out of five. It’s surprisingly good at searches that involve common nouns (“heathen child lyrics”) and what search experts call vertical searches for popular topics like airline flights and movie listings. Voice Search knows “United Flight 714” and “True Grit show times 90066” when it hears them. Less successful are searches involving people’s names. In repeated attempts to Google up WikiLeaks founder Julian Assange, Voice Search got no closer than “wikileaks founder julian of songs.”

How does it work? Rather than try to use the phone itself to do speech recognition, Voice Search digitizes the user’s spoken commands and sends them off to Google’s gargantuan server farms. There, the spoken words are broken down and compared both to statistical models of what words other people mean when they utter those syllables and to a history of the user’s own voice commands, through which Google refines its matching algorithm for that particular voice. The app recognizes five different flavors of English (American, British, Australian, Indian and South African) plus Afrikaans, Cantonese, Czech, Dutch, French, German, Italian, Japanese, Korean, Mandarin, Polish, Portuguese, Russian, Spanish, Turkish, and Zulu.

The tricky part—and the motive for a personalized search app—is that human voices vary wildly between men and women, between young people and old people, and among those with various accents and dialects. By storing hundreds, perhaps thousands of what speech recognition experts call “utterances” by the same person over months of use, Voice Search can better guess at what that particular person is saying.
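To make the idea of personalization concrete, here is a toy sketch of how a recognizer might combine a general statistical score with a user's own history. It is not Google's algorithm, which the article does not describe in detail. The function name `rescore`, the `weight` parameter, the example probabilities, and the simple linear blend are all assumptions made for illustration.

```python
from collections import Counter

def rescore(candidates, user_history, weight=0.3):
    """Pick the best transcription from candidate phrases.

    candidates:   dict mapping a candidate phrase to a score from a
                  general model trained on many speakers (hypothetical).
    user_history: Counter of phrases this user has confirmed before.
    weight:       how much the personal history counts (an assumption).
    """
    total = sum(user_history.values())
    scored = {}
    for phrase, general_score in candidates.items():
        # Share of this user's past utterances that match the phrase.
        personal_score = user_history.get(phrase, 0) / total if total else 0.0
        # Linear blend of the general and personal scores.
        scored[phrase] = (1 - weight) * general_score + weight * personal_score
    return max(scored, key=scored.get)

# Made-up scores: the general model slightly prefers the wrong phrase.
candidates = {"julian of songs": 0.55, "julian assange": 0.45}

new_user = Counter()
returning_user = Counter({"julian assange": 3, "wikileaks": 1})

print(rescore(candidates, new_user))
print(rescore(candidates, returning_user))
```

With no history, the general model's preference wins. Once the user has a few stored utterances, the blend can tip the result toward the phrase that person actually tends to say.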

The mathematical model used to recognize phrases was refined over three years using voice samples from Google’s now-defunct GOOG-411 automated directory assistance service, which the company operated from 2007 through late last year specifically to capture a wide-ranging set of voice samples for analysis. The company’s first Voice Search app, for iPhone only, was launched in November 2008, a year after GOOG-411.

Voice Search doubles as a spoken-command system for the phone. As shown in this video, it understands commands such as, “Send mail to Mike LeBeau. How’s life in New York treating you? The weather’s beautiful here.” The app will find LeBeau in your contacts—it’s better at matching names here than in a Web search, because it’s working with a limited set—and will fill in the subject line with your first sentence. You can speak additional text into the message, or edit it with the phone’s keyboard, before sending it.

Google has clearly put a lot of effort into its speech recognition technology. But the impact on its bottom line is obvious: By removing the aggravation of typing on tiny keys, the company hopes to get customers to reach for its search and e-mail services much more often.
