How Computers Can Tell What They’re Looking At
Software has lately become much, much better at understanding images. Last year Microsoft and Google showed off systems more accurate than humans at recognizing objects in photos, as judged by the standard benchmark researchers use.
That became possible thanks to a technique called deep learning, which involves passing data through networks of roughly simulated neurons to train them to filter future data (see “Teaching Machines to Understand Us”). Deep learning is why you can search images stored in Google Photos using keywords, and why Facebook recognizes your friends in photos before you’ve tagged them. Using deep learning on images is also making robots and self-driving cars more practical, and it could revolutionize medicine.
That power and flexibility come from the way an artificial neural network can figure out which visual features to look for in images when provided with lots of labeled example photos. The neural networks used in deep learning are arranged into a hierarchy of layers that data passes through in sequence. During the training process, different layers in the network become specialized to identify different types of visual features. The type of neural network used on images, known as a convolutional net, was inspired by studies on the visual cortex of animals.
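To make the idea of layered feature detection concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not how any production system is built: the filter values are set by hand, whereas a real convolutional net learns them from labeled photos, and the names `conv2d`, `relu`, `edge_kernel`, and `stack_kernel` are hypothetical. The first layer responds to a dark-to-bright vertical edge; the second combines those responses into a detector for a longer edge.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and
    return the map of weighted sums, one per kernel position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses and zero out the rest."""
    return [[max(0, v) for v in row] for row in feature_map]


# Tiny 4x4 grayscale image: dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Assumption: hand-set weights standing in for learned ones.
# This 2x2 filter responds where brightness rises from left to right.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

# A 3x1 filter for the second layer that adds up vertically
# adjacent first-layer responses.
stack_kernel = [[1], [1], [1]]

layer1 = relu(conv2d(image, edge_kernel))    # short vertical edges
layer2 = relu(conv2d(layer1, stack_kernel))  # one longer vertical edge

print(layer1)
print(layer2)
```

In this toy case the first layer lights up only in the column where dark meets bright, and the second layer fires most strongly where those responses line up. Training replaces the hand-picked numbers with values that reduce errors on labeled examples.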
“These networks are a huge leap over traditional computer vision methods, since they learn directly from the data they are fed,” says Matthew Zeiler, CEO of Clarifai, which offers an image recognition service used by companies including BuzzFeed to organize and search photos and video. Programmers previously had to design by hand the mathematical routines that look for visual features, and the results weren’t good enough to build many useful products.
Zeiler developed a way to visualize the workings of neural networks as a grad student working with Rob Fergus at NYU. The images in the slideshow above take you inside a deep-learning network trained with 1.3 million photos for the standard image recognition test on which systems from Microsoft and others can now beat humans. It asks software to spot 1,000 different objects as diverse as mosquito nets and mosques. Each image shows visual features that most strongly activate neurons in one layer of the network.
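The notion of features that "most strongly activate" a neuron can be sketched in a few lines of plain Python. This is a much simpler stand-in than Zeiler's visualization technique, which the article does not detail: it scores a handful of made-up 2x2 patches against one hand-set filter and reports the patch that produces the largest response. The patch names, the filter values, and the `activation` helper are all assumptions for illustration.

```python
def activation(patch, kernel):
    """Response of one simulated neuron: weighted sum of the patch,
    clipped at zero."""
    total = sum(
        p * k
        for patch_row, kernel_row in zip(patch, kernel)
        for p, k in zip(patch_row, kernel_row)
    )
    return max(0, total)


# Hypothetical 2x2 grayscale patches (0 = dark, 1 = bright).
patches = {
    "flat_dark": [[0, 0], [0, 0]],
    "flat_bright": [[1, 1], [1, 1]],
    "vertical_edge": [[0, 1], [0, 1]],
    "horizontal_edge": [[0, 0], [1, 1]],
}

# Assumption: a hand-set filter standing in for a learned one.
kernel = [
    [-1, 1],
    [-1, 1],
]

# Score every patch and pick the one with the strongest response.
scores = {name: activation(patch, kernel) for name, patch in patches.items()}
strongest = max(scores, key=scores.get)

print(scores)
print(strongest)
```

Only the vertical-edge patch produces a nonzero score here, so it is the one a visualization of this toy neuron would display.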