David Zax

A View from David Zax

Can We Make Machines Listen More Carefully?

A new project aims to create far smarter voice recognition software.

  • July 19, 2011

You probably use voice recognition technology already, if in a limited capacity. Maybe you use Google’s voice-activated search, or take advantage of its (somewhat wonky) voice-mail transcriptions in Google Voice. At the office, maybe you use Dragon dictation software. Even if these programs worked perfectly, though (which they don’t), they would still leave something to be desired. Voice recognition software today works only in very specialized circumstances: it can typically recognize only one voice at a time, and it performs best when it has reams of data in the archive before tackling a new speech sample.

What if we had voice recognition technology that didn’t have so many strictures? What if we had software that was quick and nimble, able to discern one speaker from another on the fly? In other words, what if voice recognition technology were more like the way voice recognition actually works in the real world, in the human brain?

A coalition of three British universities (Cambridge, Sheffield, and Edinburgh) is working to bring us what they call “natural speech technology.” Google and Dragon are (relatively) good at what they do, Thomas Hain of Sheffield recently told The Engineer. “But where it’s about natural speech—people having a normal conversation—these applications still have very poor performance.”

With nearly $10 million of funding from Britain’s Engineering and Physical Sciences Research Council, the team has set itself four main technical objectives.

First, they want to make speech software that’s smart: software that can learn and adapt on the fly. They intend to build models and algorithms that can “adapt to new scenarios and speaking styles, and seamlessly adapt to new situations and contexts almost instantaneously,” the team members write.

Second, they want those models and algorithms to be smart enough to eavesdrop on a meeting, and to be able to sift “who spoke what, when, and how.” In other words, they want speech software as adept as a great human stenographer. Then, looking forward, the team’s third and fourth goals are to create technologies building on their models: speech synthesizers (for sufferers of stroke or neurodegenerative diseases) that learn from data and that are “capable of generating the full expressive diversity of natural speech”; and various other applications. These are as yet vaguely defined, but might include something the team calls “personal listeners.”

It’s very ambitious stuff, enough to make you pause and consider a future in which speech recognition is ubiquitous, seamless, and orders of magnitude more useful than it is today. Some of the researchers are already at work on some applications; Hain’s award-winning team is collaborating with the BBC to transcribe its back catalog of audio and video footage.
