Speak Easy

IBM aims to solve speech recognition’s nagging problems.
May 1, 2002

The idea of computers that accurately understand human speech has both enticed and frustrated engineers. But now, IBM Research in Yorktown Heights, NY, is undertaking a multiyear project to finally solve the problems that have kept voice recognition systems from comprehending free-form conversations and from becoming mainstream technology.

IBM aims to create a system that understands perhaps 20 languages, including medical and legal terms, with about 98 percent accuracy, a big improvement over the 80 to 85 percent accuracy of IBM’s own speech recognition products and those from firms such as Peabody, MA-based ScanSoft. Troubles with accuracy are largely to blame for the limited market for speech recognition, which has so far been relegated mainly to dictation and telephone-based automated-response applications. IBM also hopes to overcome the other limitations of current systems: the need for hours of training, quiet surroundings and steady voice inflections. By making voice recognition more accurate and more broadly applicable, IBM believes it could open markets in real-time transcription for business meetings and new voice interfaces for handheld computers, or for search engines that could retrieve sound bites from audio databases of news broadcasts and speeches.

In current speech recognition technology, algorithms compare the waveform, an electronic representation of a word, to a master waveform database to develop a short list of possible matches, then select the most commonly used word on that list. IBM is exploring ways to make better matches, including new algorithms that make guesses based on the context of the conversation. IBM researchers have also built a lip-reading video system that reduces errors by one-third, says David Nahamoo, group manager of Human Language Technologies for IBM Research. “We’re combining audio and visual features together, which we’re feeding into our recognition engines,” he says. “We’re learning how to use one to clean up the other.”
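The match-then-pick approach described above can be sketched in a few lines of Python. This is a toy illustration, not IBM's implementation: the `MASTER_DB` table, its four-sample "waveforms," the usage frequencies, and the function names are all made up for the example, and plain Euclidean distance stands in for whatever comparison a real recognizer uses.

```python
import math

# Hypothetical master database: word -> (template waveform, usage frequency).
# Templates and frequencies are invented values for illustration only.
MASTER_DB = {
    "two": ([0.1, 0.8, 0.3, 0.0], 0.30),
    "too": ([0.1, 0.7, 0.4, 0.0], 0.15),
    "to":  ([0.1, 0.8, 0.4, 0.1], 0.50),
    "tea": ([0.9, 0.2, 0.1, 0.6], 0.05),
}


def distance(a, b):
    """Euclidean distance between two equal-length waveforms (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def short_list(waveform, db, k=3):
    """Return the k words whose templates lie closest to the input waveform."""
    ranked = sorted(db, key=lambda word: distance(waveform, db[word][0]))
    return ranked[:k]


def recognize(waveform, db=MASTER_DB, k=3):
    """Build the short list, then pick the most commonly used word on it."""
    candidates = short_list(waveform, db, k)
    return max(candidates, key=lambda word: db[word][1])


# An input that sits between "two" and "too" acoustically.
heard = [0.1, 0.75, 0.35, 0.0]
print(short_list(heard, MASTER_DB))
print(recognize(heard))
```

With these invented values, the short list holds "two," "too" and "to," and the frequency rule picks "to" even though it is the weakest acoustic match of the three. That is the kind of error a context-aware algorithm is meant to avoid.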

Some experts are skeptical. Real-time meeting transcription is still “lab stuff right now,” says Steve McClure, vice president at technology market researcher IDC in Framingham, MA. “I’ve seen IBM demos work fine one time, and another time the damn application wouldn’t work at all.” Nahamoo concedes the initiative needs years of work and lots of luck to reach its goals. But given the speech recognition industry’s history of failing to deliver on its promises, Big Blue’s newest push could provide a few words of encouragement to the struggling technology.
