Technology Review - Published By MIT

June 2005

From The Lab: Information Technology


By Monya Baker


Debabbling
Isolating speech signals in audio recordings

Context: Microphones placed around a meeting room tend to yield recordings in which voices overlap and are hard to distinguish. When there are at least as many microphones as people talking, computer algorithms can isolate the audio of each speaker. But with fewer microphones than speakers, these methods fail and the voices remain overlapped. Alternative methods require creating a profile of each speaker's voice from previous recordings or making certain assumptions about the audio signals. Now Francis Bach and Michael Jordan of the University of California, Berkeley, have developed an algorithm that separates the voices of multiple speakers in recordings made with just one microphone, without requiring strong prior assumptions or speaker profiles.

Methods and Results: Bach and Jordan's algorithm homes in on the voice characteristics that are most likely to vary among people. The recorded sounds are laid out in a spectrogram, a two-dimensional graph that shows the intensity of sound at various frequencies over time. The algorithm automatically divides up the spectrogram among the speakers; it assumes that parts of the spectrogram are likely to come from the same speaker if they are near each other on the graph, vary similarly over time, or are alike in pitch and timbre. The algorithm is trained on samples in which separately recorded voices have been mixed; based on the training, it assigns a relative importance to each characteristic, such as timbre or tempo. It then applies this training to new recordings. So far, the authors have been able to separate the overlapping voices in several recordings of pairs of speakers. Although the separation is not perfect, both speakers are more intelligible than in the original recording.
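
The grouping idea can be illustrated with a toy Python sketch. This is not Bach and Jordan's implementation: the two features (time and frequency only), the hand-set weights, the Gaussian similarity, and the function names `affinity` and `two_way_spectral_cut` are all assumptions made for illustration. The sketch only shows how weighted similarity cues between spectrogram bins might drive a two-way spectral partition; in the published work the weights are learned from training mixtures rather than chosen by hand.

```python
import math
import random

def affinity(bins, weights):
    """Gaussian similarity between every pair of time-frequency bins.

    bins    : list of feature tuples, here (time, frequency)
    weights : one non-negative weight per feature (hand-set assumption;
              the article says the real weights come from training)
    """
    n = len(bins)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum(w * (a - b) ** 2
                     for w, a, b in zip(weights, bins[i], bins[j]))
            W[i][j] = math.exp(-d2)
    return W

def two_way_spectral_cut(W, iters=200, seed=0):
    """Split items into two groups using the second eigenvector of the
    normalized affinity matrix D^-1/2 W D^-1/2, found by power iteration."""
    n = len(W)
    s = [math.sqrt(sum(row)) for row in W]  # square roots of the degrees
    # The leading eigenvector is proportional to sqrt(degree); normalize it.
    norm = math.sqrt(sum(v * v for v in s))
    v1 = [v / norm for v in s]
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        # Remove the component along the leading eigenvector.
        dot = sum(a * b for a, b in zip(x, v1))
        x = [a - dot * b for a, b in zip(x, v1)]
        # Multiply by the normalized affinity matrix, then rescale.
        x = [sum(W[i][j] * x[j] / (s[i] * s[j]) for j in range(n))
             for i in range(n)]
        length = math.sqrt(sum(a * a for a in x))
        x = [a / length for a in x]
    # The sign of each entry assigns the bin to one of the two groups.
    return [0 if a < 0 else 1 for a in x]

# Toy "spectrogram bins" as (time, frequency) pairs: two distant clumps.
bins = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
labels = two_way_spectral_cut(affinity(bins, weights=(0.5, 0.5)))
print(labels)
```

Which clump receives label 0 and which receives label 1 is arbitrary; only the grouping matters. Raising one weight relative to the other makes that cue count for more when deciding which bins belong together, which is the role the article describes for the learned importances.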

Why It Matters: Historians, journalists, lawyers, and other professionals rely on recorded conversations. These recordings are often made with a single microphone but feature multiple voices. By making babble more comprehensible, Bach and Jordan's algorithm promises to make such recordings more useful and easier to analyze. Users may also no longer be forced to haul around bulky, expensive equipment when recording important conversations and events.

Source: Bach, F. R., and M. I. Jordan. 2005. Blind one-microphone speech separation: a spectral learning approach. Advances in Neural Information Processing Systems 17 (in press).


This article is from the June 2005 issue of Technology Review.
