AI image recognition has made some stunning advances, but as new research shows, the systems can still be tripped up by examples that would never fool a person.
Labsix, a group of MIT students who recently tricked an image classifier developed by Google to think a 3-D-printed turtle was a rifle, released a paper on Wednesday that details a different technique that could fool systems even faster. This time, however, they managed to trick a “black box,” where they only had partial information on how the system was making decisions.
The team’s new algorithm starts with an image it wants to use to trick another system—in the example from their paper, it’s a dog—and then starts altering pixels to make the image look more like the source image; in this case, skiers. As it works, the “adversarial” algorithm challenges the image recognition system with versions of the picture that quickly move into territory any human would recognize as skiers (check out the gif, above). But all the while, the algorithm maintains just the right combination of sabotaged pixels to make the system think it’s looking at a dog.
Don’t settle for half the story.
Get paywall-free access to technology news for the here and now.