Current voice recognition systems are pretty good, as long as only one person is speaking. But as we’ve said before, picking out individual voices when several people talk at once, often known as the cocktail party problem, is tough, even for firms like Amazon, which has amassed gobs of data via its Alexa smart assistant platform.
Now, though, a team of researchers from Mitsubishi Electric Research Laboratories has developed a trick to identify features in a voice that can be used to track a single person in conversation. According to New Scientist, by chopping up audio and identifying how clusters of those features occur over time, it’s possible to trace a voice even in the din of a crowd.
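To make the "chop up audio, then cluster the features" idea concrete, here is a toy sketch in Python. It is not the researchers' method, which the article does not spell out. It assumes two synthetic "voices" (plain tones at 220 Hz and 880 Hz) that take turns rather than overlap, and it swaps in ordinary k-means on per-frame spectra as a stand-in clustering step. The function names `frame_features` and `kmeans` are hypothetical helpers written for this illustration.

```python
import numpy as np

def frame_features(signal, frame_len=256):
    """Chop audio into non-overlapping frames; return unit-norm magnitude spectra."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    spectra = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))
    norms = np.linalg.norm(spectra, axis=1, keepdims=True)
    return spectra / np.maximum(norms, 1e-12)

def kmeans(features, k=2, n_iter=20):
    """Plain k-means with a deterministic farthest-point initialisation."""
    centers = [features[0]]
    for _ in range(1, k):
        # Distance from every frame to its nearest already-chosen center.
        d = np.min([np.linalg.norm(features - c, axis=1) for c in centers], axis=0)
        centers.append(features[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return labels

# Assumed toy input: two "speakers" as pure tones, each lasting 16 frames.
sample_rate = 8000
frame_len = 256
t = np.arange(16 * frame_len) / sample_rate
voice_a = np.sin(2 * np.pi * 220 * t)
voice_b = np.sin(2 * np.pi * 880 * t)
signal = np.concatenate([voice_a, voice_b])

# One cluster label per frame: which "voice" is active over time.
labels = kmeans(frame_features(signal, frame_len), k=2)
print(labels)
```

Each frame gets a cluster label, so reading the labels in order traces which voice is active over time. Real overlapping speech is far harder, because several voices share the same frame, and that overlap is what makes the cocktail party problem tough.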
How good is it? Well, results published on the arXiv suggest it can track a single person in conversation even when five people are talking, and can isolate a single voice from two others with 80 percent accuracy. So, not perfect. But it’s a big step toward having Alexa understand you when you ask it to play your new jam over the hubbub of your friends at a dinner party.