Skip to Content
Uncategorized

Speak Easy

IBM aims to solve speech recognition’s nagging problems.
May 1, 2002

The idea of computers that accurately understand human speech has both enticed and frustrated engineers. But now, IBM Research in Yorktown Heights, NY, is undertaking a multiyear project to finally solve all the problems that have kept voice recognition systems from comprehending free-form conversations-and becoming mainstream technology.

IBM aims to create a system that understands perhaps 20 languages, including medical and legal terms, with about 98 percent accuracy-a big improvement over the 80 to 85 percent accuracy of IBM’s own speech recognition products and those from firms such as Peabody, MA-based ScanSoft. Troubles with accuracy are largely to blame for the limited market for speech recognition, which has so far been relegated mainly to dictation and telephone-based automated-response applications. IBM also hopes to overcome the other limitations of current systems: the need for hours of training, quiet surroundings and steady voice inflections. By making voice recognition more accurate and more broadly applicable, IBM believes it could open markets in real-time transcription for business meetings and new voice interfaces for handheld computers, or for search engines that could retrieve sound bites from audio databases of news broadcasts and speeches.

In current speech recognition technology, algorithms compare the waveform, an electronic representation of a word, to a master waveform database to develop a short list of possible matches, then select the most commonly used word on that list. IBM is exploring ways to make better matches, including new algorithms that make guesses based on the context of the conversation. IBM researchers have also built a lip-reading video system that reduces errors by one-third, says David Nahamoo, group manager of Human Language Technologies for IBM Research. “We’re combining audio and visual features together, which we’re feeding into our recognition engines,” he says. “We’re learning how to use one to clean up the other.”

Some experts are skeptical. Real-time meeting transcription is still “lab stuff right now,” says Steve McClure, vice president at technology market researcher IDC in Framingham, MA. “I’ve seen IBM demos work fine one time, and another time the damn application wouldn’t work at all.” Nahamoo concedes the initiative needs years of work and lots of luck to reach its goals. But given the speech recognition industry’s history of failing to deliver on its promises, Big Blue’s newest push could provide a few words of encouragement to the struggling technology.

Deep Dive

Uncategorized

Our best illustrations of 2022

Our artists’ thought-provoking, playful creations bring our stories to life, often saying more with an image than words ever could.

How CRISPR is making farmed animals bigger, stronger, and healthier

These gene-edited fish, pigs, and other animals could soon be on the menu.

The Download: the Saudi sci-fi megacity, and sleeping babies’ brains

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. These exclusive satellite images show Saudi Arabia’s sci-fi megacity is well underway In early 2021, Crown Prince Mohammed bin Salman of Saudi Arabia announced The Line: a “civilizational revolution” that would house up…

10 Breakthrough Technologies 2023

Every year, we pick the 10 technologies that matter the most right now. We look for advances that will have a big impact on our lives and break down why they matter.

Stay connected

Illustration by Rose Wong

Get the latest updates from
MIT Technology Review

Discover special offers, top stories, upcoming events, and more.

Thank you for submitting your email!

Explore more newsletters

It looks like something went wrong.

We’re having trouble saving your preferences. Try refreshing this page and updating them one more time. If you continue to get this message, reach out to us at customer-service@technologyreview.com with a list of newsletters you’d like to receive.