AI Is Learning to Pick Out Voices from a Crowd’s Chatter
Current voice recognition systems are pretty good, as long as only one person is speaking. But as we’ve said before, picking out one voice when several people talk at once, often called the cocktail party problem, is tough, even for firms like Amazon, which has amassed gobs of data via its Alexa smart assistant platform.
Now, though, a team of researchers from Mitsubishi Electric Research Laboratories has developed a trick to identify features in a voice that can be used to track a single person in conversation. According to New Scientist, by chopping up audio and identifying how clusters of those features occur over time, it’s possible to trace a voice even in the din of a crowd.
How good is it? Well, results published on the arXiv suggest it can track a single person in conversation even when five people are talking, and can isolate a single voice from two others with 80 percent accuracy. So, not perfect. But it’s a big step toward having Alexa understand you when you ask it to play your new jam over the hubbub of your friends at a dinner party.
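For a rough feel of the "chop up the audio, then cluster the features" idea, here is a toy sketch in Python. It is not the researchers' method: their system handles overlapping speech, while this example only separates two synthetic pure tones that take turns. The feature (zero-crossing rate), the clustering step (a two-group k-means), and every name in the code are assumptions chosen for illustration.

```python
import math

RATE = 8000  # samples per second (assumed for this toy example)


def tone(freq, seconds):
    """Synthesize a pure tone standing in for one 'speaker' (hypothetical stand-in)."""
    n = int(RATE * seconds)
    return [math.sin(2 * math.pi * freq * t / RATE + 0.1) for t in range(n)]


def frames(signal, size, hop):
    """Chop the signal into fixed-length frames, stepping `hop` samples each time."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs that change sign; it rises with pitch."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)


def two_means(values, iterations=20):
    """One-dimensional k-means with k=2, seeded at the smallest and largest value."""
    low, high = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to the nearer of the two centers.
        labels = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        # Move each center to the mean of its group.
        if group0:
            low = sum(group0) / len(group0)
        if group1:
            high = sum(group1) / len(group1)
    return labels


# Two "voices" taking turns: a low tone, a high tone, then the low tone again.
signal = tone(120, 0.5) + tone(300, 0.5) + tone(120, 0.5)
features = [zero_crossing_rate(f) for f in frames(signal, size=400, hop=400)]
labels = two_means(features)
print(labels)  # one cluster label per frame; label 0 is the lower-pitched group
```

Frames carrying the same label are attributed to the same source, so reading the labels in order gives a crude timeline of who is "speaking" when. Real speech overlaps and varies far more than these tones do, which is why a practical system needs much richer features than a single number per frame.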