Artificial intelligence can accurately identify objects in an image or recognize words uttered by a human, but its algorithms don’t work the same way as the human brain—and that means that they can be spoofed in ways that humans can’t.
New Scientist reports that researchers from Bar-Ilan University in Israel and Facebook’s AI team have shown that it’s possible to subtly tweak audio clips so that a human understands them as normal but a voice-recognition AI hears something totally different. The approach works by adding a quiet layer of noise to a sound clip; the noise contains distinctive patterns that a neural network will associate with other words.
The team applied its new algorithm, called Houdini, to a series of sound clips, which it then ran through Google Voice to have them transcribed. An example of an original sound clip read:
Her bearing was graceful and animated she led her son by the hand and before her walked two maids with wax lights and silver candlesticks.
When that original was passed through Google Voice it was transcribed as:
The bearing was graceful an animated she let her son by the hand and before he walks two maids with wax lights and silver candlesticks.
But the hijacked version, which listening tests confirmed to be indistinguishable to human ears from the original, was transcribed as:
Mary was grateful then admitted she let her son before the walks to Mays would like slice furnace filter count six.
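The general idea of a quiet noise layer that flips a machine's answer can be illustrated with a toy model. The sketch below is not Houdini and does not touch a real speech recognizer: it assumes a made-up linear "recognizer" that answers "yes" or "no", and every name in it (`score`, `label`, `adversarial_noise`, the synthetic `signal` and `weights`) is hypothetical. It nudges each sample by the same small amount in whichever direction moves the model's score toward the opposite answer, which is a common textbook way to build an adversarial example.

```python
import math


def score(weights, bias, signal):
    """Toy linear 'recognizer': weighted sum of the samples plus a bias."""
    return sum(w * x for w, x in zip(weights, signal)) + bias


def label(weights, bias, signal):
    """Turn the score into one of two answers."""
    return "yes" if score(weights, bias, signal) > 0 else "no"


def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def adversarial_noise(weights, bias, signal, margin=1.05):
    """Build a small noise layer that pushes the score across zero.

    Every sample moves by at most epsilon, in the direction that shifts
    the score toward the opposite answer. epsilon is chosen just large
    enough (by the given margin) to cross the decision boundary.
    """
    current = score(weights, bias, signal)
    total_weight = sum(abs(w) for w in weights)
    epsilon = margin * abs(current) / total_weight
    direction = -sign(current)
    noise = [direction * epsilon * sign(w) for w in weights]
    return noise, epsilon


# Synthetic stand-ins (assumptions): a sine-wave "clip" and fixed weights.
n = 200
signal = [math.sin(0.1 * i) for i in range(n)]
weights = [math.cos(0.37 * i) for i in range(n)]
bias = 0.5

noise, epsilon = adversarial_noise(weights, bias, signal)
tampered = [x + d for x, d in zip(signal, noise)]

print("original label:", label(weights, bias, signal))
print("tampered label:", label(weights, bias, tampered))
print("largest change to any sample:", max(abs(d) for d in noise))
```

In this toy case the clip swings between -1 and 1 while no sample changes by more than `epsilon`, yet the answer flips. Real attacks on deep networks are far more involved, but they rely on the same kind of small, targeted change.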
The team’s approach can also be applied to other machine-learning algorithms. By tweaking images of people, it’s possible to confuse an algorithm designed to spot a human pose into thinking that a person is actually assuming a different stance, as in the image above. And by adding noise to an image of a road scene, the team was able to fool an AI algorithm of the kind used in autonomous-car applications to classify features like roads and signs into seeing ... a minion instead. Those image-based results are similar to research published last year by researchers from the machine-learning outfits OpenAI and Google Brain.
These so-called adversarial examples may seem like a strange area of research, but they can be used to stress-test machine-learning algorithms. More worrying, they could also be used nefariously, to trick AIs into seeing or hearing things that aren’t really there—convincing autonomous cars to see fake traffic on a road, or a smart speaker to hear false commands, for example. Of course, actually implementing such attacks in the wild is rather different from running them in a lab, not least because injecting the data is tricky.
What’s perhaps most interesting about all this is that finding a way to protect AIs from these kinds of tricks is actually quite difficult. As we’ve explained in the past, we don’t truly understand the inner workings of deep neural networks, and that means that we don’t know why they’re receptive to such subtle features in a voice clip or image. Until we do, adversarial examples will remain, well, adversarial for AI algorithms.