MIT Technology Review

A New Trick Can Spoof a Speech Recognition AI Every Time

Given an audio waveform, researchers can now produce a virtually identical version that makes speech-recognition software transcribe something else entirely.

Backstory: Adversarial examples have fooled plenty of computer-vision algorithms. While all neural networks are susceptible to such attacks, researchers have had less success with audio. Previous attacks were only able to make subtle tweaks to what the software hears.

What’s new: Berkeley researchers showed that they can take a waveform and add a layer of noise that fools DeepSpeech, a state-of-the-art speech-to-text AI, every time. The technique can make music sound like arbitrary speech to the AI, or obscure voices so they aren’t transcribed.
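To make the idea of a "layer of noise" concrete, here is a toy sketch in Python. It is not the researchers' method and does not touch DeepSpeech: it assumes a made-up linear classifier (the random `weights` matrix) standing in for the recognizer, and the names `predict`, `targeted_perturbation`, `epsilon`, and `step` are all hypothetical. It only illustrates the general pattern of nudging each audio sample by a tiny, bounded amount until the model outputs a chosen label.

```python
import numpy as np


def predict(weights, waveform):
    """Return the index of the highest-scoring label for a waveform."""
    return int(np.argmax(weights @ waveform))


def targeted_perturbation(weights, waveform, target,
                          epsilon=0.03, step=0.003, max_iters=200):
    """Search for bounded noise that makes the toy model output `target`.

    No sample is allowed to change by more than `epsilon`.
    """
    noise = np.zeros_like(waveform)
    for _ in range(max_iters):
        current = predict(weights, waveform + noise)
        if current == target:
            break
        # For a linear model, the gradient of (target score - current score)
        # with respect to the input is the difference of the two weight rows.
        gradient = weights[target] - weights[current]
        # Take a small signed step, then clip to keep the noise tiny.
        noise = np.clip(noise + step * np.sign(gradient), -epsilon, epsilon)
    return noise


# Assumed toy setup: 10 seconds of a 440 Hz tone at 16 kHz, 5 possible labels.
rng = np.random.default_rng(0)
sample_rate, n_labels = 16000, 5
n_samples = 10 * sample_rate
weights = rng.normal(size=(n_labels, n_samples))  # stand-in "recognizer"
waveform = np.sin(2 * np.pi * 440 * np.arange(n_samples) / sample_rate)

original = predict(weights, waveform)
target = (original + 1) % n_labels  # any label other than the original
noise = targeted_perturbation(weights, waveform, target)
adversarial = waveform + noise

print("original label:", original)
print("label after adding noise:", predict(weights, adversarial))
print("largest change to any sample:", float(np.max(np.abs(noise))))
```

In this sketch the noise never moves a sample by more than `epsilon`, which is 3 percent of the tone's peak amplitude. A real speech-to-text network is far more complex than a linear scorer, so an actual attack would need gradients from the network itself and a loss defined over whole transcriptions.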

Brace for annoyance: Imagine playing a music video from YouTube on your speakers and having Alexa “hear” an order for two tons of creamed corn. Welcome to AI attack hell.
