Artificial intelligence is great at identifying objects in images, but it’s still pretty easy to mess it up. Add a few choice strokes or layer in some static noise invisible to the human eye, and you can throw off an image recognition system, sometimes to deadly effect. Adding stickers to a stop sign can make a self-driving car believe the sign is posting a 45-mile-per-hour speed limit, for example, while adding them to a road can make a Tesla swerve into the oncoming traffic lane. (On the bright side, the same techniques can also shield you from the surveillance state. You win some, you lose some.)
All of these are known as adversarial examples—and researchers are now scrambling to develop ways to protect AI systems from them. But in a paper last year, a group of researchers at Google Brain and Princeton, including one of the earliest researchers on this topic, Ian Goodfellow, argued that the emerging scholarship was too theoretical and missed the point.
While the bulk of the research focused on protecting systems from specially designed perturbations, a hacker, they said, would be likely to choose a blunter tool: a completely different photo rather than a noise pattern to layer on an existing one. This, too, could cause the system to misbehave.
The critique prompted Dan Hendrycks, a PhD student at the University of California, Berkeley, to compile a new image data set. He calls the images it contains “natural adversarial examples”—without any special tweaks, they fool a system anyway.
They include things like a squirrel that common systems mislabel as a sea lion, or a dragonfly that they misidentify as a manhole cover. “These examples appear much harder to defend against,” he says. Synthetic adversarial examples need to know all of the AI system’s defenses to be most effective. In contrast, natural examples can work pretty well even when those defenses change, he says.
Hendrycks released an early version of the data set, with around 6,000 images, last week at the International Conference on Machine Learning. He plans to release a final version with close to 8,000 in a couple of weeks. He intends the research community to use the data set as a benchmark.
In other words, rather than train image recognition systems directly on the images, they should reserve them only for testing. “If people were to just train on this data set, that’s just memorizing these examples,” he says. “That would be solving the data set but not the task of being robust to new examples.”
Cracking the logic behind the sometimes baffling errors the examples cause could lead to more resilient systems. “How is this confusing a dragonfly for guacamole?” Hendrycks jokes. “It’s not too clear why the mistake is being made at all.”
Artificial intelligence is creating a new colonial world order
An MIT Technology Review series investigates how AI is enriching a powerful few by dispossessing communities that have been dispossessed before.
Meta has built a massive new language AI—and it’s giving it away for free
Facebook’s parent company is inviting researchers to pore over and pick apart the flaws in its version of GPT-3
This horse-riding astronaut is a milestone in AI’s journey to make sense of the world
OpenAI’s latest picture-making AI is amazing—but raises questions about what we mean by intelligence.
How the AI industry profits from catastrophe
As the demand for data labeling exploded, an economic catastrophe turned Venezuela into ground zero for a new model of labor exploitation.
Get the latest updates from
MIT Technology Review
Discover special offers, top stories, upcoming events, and more.