Imagine you’re trying to have a catch-up with your best friend in the middle of a noisy pub. Despite the distracting background noise, you can filter out the hubbub and still hear all your friend’s best gossip. This so-called “cocktail-party effect” comes naturally to many of us, but for people who use hearing aids, coping with irrelevant noises is difficult and deeply frustrating.
A potentially transformative new system, however, can work out who you want to listen to and amplify that voice. To understand the listener’s intention, it uses electrodes placed on the auditory cortex, the part of the brain, located in the temporal lobe near the ear, that processes sound. As the brain focuses on each voice, it generates a telltale electrical signature for each speaker.
A deep-learning algorithm trained to distinguish voices looks for the closest match between this signature and those of the various speakers in the room. It then amplifies the voice that matches best, helping the listener focus on the desired one.
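The matching step can be pictured in miniature. In this sketch (an illustration only, not the team’s actual code), a speech envelope decoded from brain activity is compared against each separated speaker’s envelope by correlation, and the best-matching speaker is boosted in the output mix; the function names and the boost factor are invented for the example.

```python
# Illustrative sketch of "match the neural signature to a speaker, then
# amplify that speaker." Assumes the hard parts -- separating the speakers
# and decoding an envelope from auditory-cortex signals -- are already done.
import numpy as np

def pick_attended_speaker(decoded_env, speaker_envs):
    """Return the index of the speaker whose speech envelope best matches
    the envelope reconstructed from brain activity (Pearson correlation)."""
    scores = [np.corrcoef(decoded_env, env)[0, 1] for env in speaker_envs]
    return int(np.argmax(scores))

def remix(speaker_tracks, attended_idx, boost=4.0):
    """Re-mix the audio, amplifying the attended speaker's track."""
    gains = np.ones(len(speaker_tracks))
    gains[attended_idx] = boost
    return sum(g * track for g, track in zip(gains, speaker_tracks))
```

The real system faces exactly the trade-off described later in the piece: correlations computed over a longer window are more reliable but add lag before the right voice is boosted.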
The system, described in Science Advances and created by a team led by researchers at Columbia University, was tested on three people without hearing loss who were undergoing surgery at North Shore University Hospital in New York. They had electrodes implanted as part of their treatment for epilepsy, meaning their brain signals could be monitored. The participants were played a tape of four different people speaking continuously. The researchers intermittently paused the recording and asked the subjects to repeat the last sentence before the pause, to ensure they were hearing it correctly. They were able to do so with an average accuracy of 91%.
There’s one obvious drawback: the current system involves brain surgery to implant the electrodes. However, the researchers say brain waves could be measured using sensors placed in or over the ear, meaning the system could eventually be embedded into a hearing aid (although this would be less accurate). It could also be used by people without hearing loss who want to boost their ability to focus on one voice.
Another difficulty is the time lag. It’s just a few seconds, but it could mean missing the start of someone’s sentence, says Nima Mesgarani, at Columbia University’s Neural Acoustic Processing Lab, who coauthored the paper. There’s an inherent trade-off between speed and accuracy in zeroing in on a specific speaker, he says—in other words, the longer the system has to listen, the more accurate it is. This issue requires further research to solve, but he says this sort of device could be commercially available in just five years.
This study is just a proof of concept, but it shows exciting potential, says Behtash Babadi, at the University of Maryland’s Department of Electrical and Computer Engineering, who was not involved in the research.
“Within just a few seconds, someone using a device like this could silence everyone but the person they want to hear,” he says. “This work is the first to really solve this problem, and it’s a leap toward making this solution a reality.”