AI Is Learning to Pick Out Voices from a Crowd’s Chatter
Current voice recognition systems are pretty good, as long as only one person is speaking. But as we’ve said before, picking out one voice when several people are talking, often known as the cocktail party problem, is tough, even for firms like Amazon, which has amassed gobs of data via its Alexa smart assistant platform.
Now, though, a team of researchers from Mitsubishi Electric Research Laboratories has developed a trick to identify features in a voice that can be used to track a single person in conversation. According to New Scientist, by chopping up audio and identifying how clusters of those features occur over time, it’s possible to trace a voice even in the din of a crowd.
How good is it? Well, results published on the arXiv suggest it can track a single person in conversation even when five people are talking, and can isolate a single voice from two others with 80 percent accuracy. So, not perfect. But it’s a big step toward having Alexa understand you when you ask it to play your new jam over the hubbub of your friends at a dinner party.