More and more people are using their computers for voice communication through services such as Skype and audio instant messaging. For the most part, however, using these features requires a person either to be tethered to the computer by a headset or to speak directly into a microphone and keep the speaker volume low, especially in a shared office.
In light of that problem, researchers at Microsoft are trying to make audio output more sophisticated. A team led by Ivan Tashev, a software architect at Microsoft, recently began work on an algorithm that, in theory, will be able to direct sound from a set of speakers, ideally embedded in a computer monitor, into a person's ears, effectively creating virtual headphones; just a few inches outside the focal point of the sound waves, the volume fades away dramatically. Crucially, says Tashev, his algorithm could work with a wide range of inexpensive speakers that could be built into computer monitors.
The goal, he says, is to "target focused sound so that a person can walk around an office and hear" while on a video- or computer-aided audio conference call. Information about a person's location could be collected by hardware peripherals and fed back into the speaker software, allowing the virtual headphones to move with the user in real time. For example, Tashev says, a camera, either mounted on or embedded in a computer monitor, and image-processing software could determine a person's position. In addition, an array of four or more microphones on or near a computer monitor could be programmed to localize sound by measuring the subtle differences in when a sound arrives at each microphone in the array. In fact, Tashev's previous work has been to design such sound-localizing algorithms for the types of microphones commonly found in the bezel of laptop computers. Employing both a camera and a microphone array could improve tracking accuracy and extend the distance a person can roam while using the speakers.
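The arrival-time trick behind such microphone arrays can be illustrated with a short sketch. This is not Tashev's algorithm, just the textbook far-field geometry for a two-microphone pair: a source off to one side reaches the nearer microphone first, and the delay pins down the bearing. The microphone spacing and the 343 m/s speed of sound are assumed values.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at room temperature (assumed)

def bearing_from_delay(delay_s: float, mic_spacing_m: float) -> float:
    """Estimate a sound source's bearing (degrees off the array's
    broadside axis) from the arrival-time difference between two mics.

    For a distant source at angle theta, the extra path to the far
    microphone is mic_spacing * sin(theta), so
    delay = mic_spacing * sin(theta) / c.
    """
    sin_theta = SPEED_OF_SOUND * delay_s / mic_spacing_m
    sin_theta = max(-1.0, min(1.0, sin_theta))  # clamp measurement noise
    return math.degrees(math.asin(sin_theta))

# A source 30 degrees off-axis, with mics 10 cm apart, arrives at the
# far microphone roughly 146 microseconds later:
delay = 0.10 * math.sin(math.radians(30.0)) / SPEED_OF_SOUND
print(round(bearing_from_delay(delay, 0.10), 1))  # → 30.0
```

With four or more microphones, several such pairwise delays can be combined to fix a position rather than just a direction, which is what a monitor-mounted array would need to steer the audio beam.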
To be sure, the idea of focusing waves into tight beams isn't new: military radar systems and the ultrasound machines used to image fetuses in utero and to find cancerous tumors have done it for years. The technique is called beamforming, and it works by delaying the sound waves from certain speakers in an array by microseconds, explains Jiashu Chen, technical manager at Finisar Corporation, a data-communications company based in Sunnyvale, CA. The delayed sound waves combine in such a way that in some parts of space the sound cancels out, while in others it grows louder.
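The delay-and-combine idea Chen describes can be sketched in a few lines. The code below is a minimal, idealized model, not Microsoft's implementation: point-source speakers on a line in a 2-D plane, a single sine tone, and no attenuation with distance. The speaker positions, focal point, and frequency are all illustrative assumptions.

```python
import math

C = 343.0  # speed of sound in air, m/s (assumed)

def focus_delays(speaker_xs, target):
    """Firing delays (s) so that sound from each speaker on the x-axis
    arrives at `target` at the same instant: speakers farther from the
    target fire earlier, making delay + distance/C equal for all."""
    tx, ty = target
    dists = [math.hypot(x - tx, ty) for x in speaker_xs]
    farthest = max(dists)
    return [(farthest - d) / C for d in dists]

def pressure(point, speaker_xs, delays, freq, t):
    """Superposition at `point`, time `t`, of one sine tone emitted by
    every speaker with its firing delay (attenuation ignored)."""
    px, py = point
    return sum(
        math.sin(2 * math.pi * freq * (t - d - math.hypot(x - px, py) / C))
        for x, d in zip(speaker_xs, delays)
    )

def peak_amplitude(point, speaker_xs, delays, freq, samples=400):
    """Largest absolute pressure over one period of the tone."""
    period = 1.0 / freq
    return max(
        abs(pressure(point, speaker_xs, delays, freq, k * period / samples))
        for k in range(samples)
    )

speakers = [-0.15, -0.05, 0.05, 0.15]  # four drivers along a monitor edge (m)
focus = (0.0, 0.5)                     # a listener half a metre away
delays = focus_delays(speakers, focus)
print(peak_amplitude(focus, speakers, delays, 2000.0))       # near 4: waves add in phase
print(peak_amplitude((0.3, 0.5), speakers, delays, 2000.0))  # off-focus: much weaker
```

At the focal point the four waves arrive in phase and their amplitudes add; a few tens of centimetres to the side, the same waves arrive out of step and largely cancel, which is the "virtual headphone" effect in miniature.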
However, beamforming systems that direct audible sound, such as music or human voices, are technically harder to build than radar or ultrasound systems, says Chen, because they must handle a much wider range of frequencies; low frequencies impose different hardware and software requirements than high ones do. Still, signal-processing technology has improved to the point that some commercial products use beamforming. Yamaha, for example, sells home-entertainment speakers that bounce focused sound off walls to create virtual speakers behind a listener's head. But such systems remain rare, and invariably pricey.
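Two standard rules of thumb for line arrays make the frequency problem concrete (these are generic acoustics formulas, and the array dimensions below are hypothetical, not figures from Microsoft or Yamaha): drivers spaced more than half a wavelength apart emit spurious extra beams called grating lobes, and the beam can be no narrower than roughly the wavelength divided by the array's overall width.

```python
import math

C = 343.0  # speed of sound in air, m/s (assumed)

def highest_clean_freq_hz(spacing_m):
    """Above roughly c / (2 * spacing), a uniformly spaced array
    radiates grating lobes alongside the intended beam."""
    return C / (2.0 * spacing_m)

def approx_beamwidth_deg(freq_hz, aperture_m):
    """Rule-of-thumb -3 dB beam width of a uniform line array,
    about 0.886 * wavelength / aperture radians, capped at 180 deg."""
    wavelength = C / freq_hz
    return min(180.0, math.degrees(0.886 * wavelength / aperture_m))

# Hypothetical monitor array: four drivers 5 cm apart, 15 cm aperture.
print(round(highest_clean_freq_hz(0.05)))        # clean steering only up to ~3.4 kHz
print(round(approx_beamwidth_deg(3000.0, 0.15)))  # a usable beam at 3 kHz
print(round(approx_beamwidth_deg(300.0, 0.15)))   # low bass: effectively unfocused
```

The squeeze is visible in the numbers: a monitor-sized array focuses the upper range of speech reasonably well, but at bass frequencies the wavelength dwarfs the aperture and the "beam" spreads everywhere, while pushing far above a few kilohertz brings grating lobes; that is why wide-band audio beamforming demands more careful hardware and signal-processing design than a narrowband radar or ultrasound system.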