Computer vision has been having a moment. No more does an image recognition algorithm make dumb mistakes when looking at the world: these days, it can accurately tell you that an image contains a cat. But the way it pulls off the party trick may not be as familiar to humans as we thought.
Most computer vision systems identify features in images using neural networks, which are inspired by our own biology and loosely mirror its architecture, only here the biological sensing and neurons are swapped out for mathematical functions. Now a study by researchers at Facebook and Virginia Tech says that despite those similarities, we should be careful about assuming that both work in the same way.
To see exactly what was happening as both humans and AI analyzed an image, the researchers studied where the two focused their attention. Both were provided with blurred images and asked questions about what was happening in the picture: “Where is the cat?” for instance. Both could selectively sharpen parts of the image, one at a time, and kept doing so until they could answer the question. The team repeated the tests using several different algorithms.
Both could provide answers, but the interesting result is how they did so. On a scale from -1 to 1, where 1 is total agreement and -1 total disagreement, two humans scored on average 0.63 in terms of where they focused their attention across the image. With a human and an AI, the average dropped to 0.26.
In other words: the AI and human were both looking at the same image, both being asked the same question, both getting it right—but using different visual features to arrive at those same conclusions.
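To make the agreement score concrete, here is a minimal sketch of how two attention maps could be compared on a scale from -1 to 1. It assumes the score is a rank correlation over the cells of each map; the article does not say which measure the researchers used, and the function names and toy grids below are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def attention_agreement(map_a, map_b):
    """Rank correlation between two attention maps (assumed measure)."""
    flat_a = [v for row in map_a for v in row]
    flat_b = [v for row in map_b for v in row]
    return pearson(average_ranks(flat_a), average_ranks(flat_b))


# Hypothetical 2x2 attention maps: higher means more attention on that region.
human_1 = [[0.9, 0.1], [0.3, 0.0]]
human_2 = [[0.8, 0.2], [0.4, 0.1]]
model = [[0.0, 0.3], [0.1, 0.9]]

humans_score = attention_agreement(human_1, human_2)
human_model_score = attention_agreement(human_1, model)
```

In this toy data the two humans rank the regions identically, so their score is 1, while the model ranks them in exactly the opposite order, so its score against the first human is -1. Real scores, like the 0.63 and 0.26 reported in the study, fall somewhere in between.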
This is an explicit result about a phenomenon that researchers had already hinted at. In 2014, a team from Cornell University and the University of Wyoming showed that it was possible to create images that fool AI into seeing something, simply by creating a picture made up of the strong visual features that the software had come to associate with an object. Humans have a large pool of common-sense knowledge to draw on, which means they don’t get caught out by such tricks. That's something researchers are trying to incorporate into a new breed of intelligent software that understands the semantic visual world.
But just because computers don’t use the same approach doesn’t necessarily mean they’re inferior. In fact, they may be better off ignoring the human approach altogether.
The kinds of neural networks used in computer vision usually employ a technique known as supervised learning to work out what’s happening in an image. Ultimately, their ability to associate a complex combination of patterns, textures, and shapes with the name of an object is made possible by providing the AI with a training set of images whose contents have already been labeled by a human.
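The core idea of supervised learning, learning from examples a human has already labeled, can be sketched in a few lines. This is an illustration only: it swaps the neural network for a much simpler nearest-centroid classifier, and the two features (a "furriness" score and a wheel count) and all names are hypothetical.

```python
from collections import defaultdict
from math import dist


def train(examples):
    """Average the feature vectors of each human-provided label.

    examples: list of (feature_vector, label) pairs.
    Returns a dict mapping each label to its centroid.
    """
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: [sum(column) / len(column) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the new example."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical training set: (furriness, wheel_count) labeled by a human.
labeled_images = [
    ([0.9, 0.0], "cat"),
    ([0.8, 0.0], "cat"),
    ([0.0, 4.0], "car"),
    ([0.1, 4.0], "car"),
]

centroids = train(labeled_images)
guess = predict(centroids, [0.7, 0.0])
```

Here `guess` comes out as "cat" because the new example sits closest to the average of the cat-labeled examples. A neural network learns far richer combinations of patterns, but it depends on the same ingredient: labels supplied by people.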
But teams at Facebook and Google’s DeepMind have been experimenting with unsupervised learning systems that ingest content from video and images to learn what human faces and everyday objects look like, without any human intervention. Magic Pony, recently bought by Twitter, also shuns supervised learning, instead learning to recognize statistical patterns in images to teach itself what edges, textures, and other features should look like.
In these cases, it’s perhaps even less likely that the knowledge of the AI will be generated through a process aping that of a human. Once inspired by human brains, AI may beat us by simply being itself.