Machines Can Now Recognize Something After Seeing It Once

Algorithms usually need thousands of examples to learn something. Researchers at Google DeepMind found a way around that.
November 3, 2016

Most of us can recognize an object after seeing it once or twice. But the algorithms that power computer vision and voice recognition typically need thousands of examples before they can reliably recognize a new object or word.

Researchers at Google DeepMind now have a way around this. They made a few clever tweaks to a deep-learning algorithm that allow it to recognize objects in images, and other things, from a single example, a feat known as "one-shot learning." The team demonstrated the trick on a large database of tagged images, as well as on handwriting and language.

The best algorithms can recognize things reliably, but their need for data makes building them time-consuming and expensive. An algorithm trained to spot cars on the road, for instance, needs to ingest many thousands of examples to work reliably in a driverless car. Gathering so much data is often impractical—a robot that needs to navigate an unfamiliar home, for instance, can’t spend countless hours wandering around learning.

Oriol Vinyals, a research scientist at Google DeepMind, the U.K.-based Alphabet subsidiary focused on artificial intelligence, added a memory component to a deep-learning system. Systems of this kind are large neural networks trained to recognize things by adjusting the sensitivity of many layers of interconnected components roughly analogous to the neurons in a brain, and they normally need to see lots of images to fine-tune the connections between those virtual neurons.
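
The article doesn't spell out the mechanism, but the broad idea of classifying against a memory of labeled examples can be sketched in a few lines. In this toy Python version, "learning" a new class means appending a single labeled embedding to memory, and the classifier labels a query by softmax-weighted similarity to everything stored. The vectors, the class names, and the choice of cosine similarity are all illustrative assumptions, not DeepMind's actual model.

```python
import numpy as np

def cosine_similarities(query, memory):
    """Cosine similarity between one query vector and each row of memory."""
    q = query / np.linalg.norm(query)
    m = memory / np.linalg.norm(memory, axis=1, keepdims=True)
    return m @ q

def classify_from_memory(query, memory, labels):
    """Label a query by softmax-weighted similarity to stored examples.

    Adding a class requires only appending one labeled embedding to
    memory; the network that produced the embeddings is not retrained.
    """
    sims = cosine_similarities(query, memory)
    weights = np.exp(sims) / np.exp(sims).sum()  # attention over memory slots
    scores = {}
    for w, label in zip(weights, labels):
        scores[label] = scores.get(label, 0.0) + w  # pool weight per class
    return max(scores, key=scores.get)

# Toy demo with made-up 8-dimensional "embeddings", one per class.
rng = np.random.default_rng(0)
memory = rng.normal(size=(3, 8))
labels = ["dog", "cat", "car"]
query = memory[0] + 0.1 * rng.normal(size=8)  # a new image near the "dog" example
print(classify_from_memory(query, memory, labels))  # dog
```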

The team demonstrated the system's capabilities on ImageNet, a large database of labeled photographs. The software still needs to analyze several hundred categories of images up front, but after that it can learn to recognize a new object, say a dog, from just one picture, effectively learning the characteristics that set one category of images apart from another. After seeing a single example, the algorithm recognized pictures of dogs almost as accurately as a conventional, data-hungry system.
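
To picture the test itself, here is a mock one-shot trial under a simplifying assumption of mine: that recognition reduces to nearest-neighbor matching in an embedding space. Each trial shows the system one labeled example from each of five classes it never trained on and asks it to label a fresh query; the random vectors stand in for embeddings a trained network would produce.

```python
import numpy as np

rng = np.random.default_rng(1)

def one_shot_episode(class_means, n_way=5, dim=8, noise=0.3):
    """One trial: a single noisy example per class, plus one query to label."""
    picks = rng.choice(len(class_means), size=n_way, replace=False)
    support = class_means[picks] + noise * rng.normal(size=(n_way, dim))
    target = int(rng.integers(n_way))  # true class of the query
    query = class_means[picks[target]] + noise * rng.normal(size=dim)
    return support, query, target

def predict(support, query):
    """Guess the class whose single example lies closest to the query."""
    return int(np.argmin(np.linalg.norm(support - query, axis=1)))

# Mock embeddings for 20 classes the system never saw during training.
class_means = rng.normal(size=(20, 8))
hits = sum(
    predict(support, query) == target
    for support, query, target in (one_shot_episode(class_means) for _ in range(1000))
)
print(f"one-shot, five-way accuracy: {hits / 1000:.1%}")
```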

Vinyals says the approach could be especially useful in language, where a system could pick up the meaning of a new word from a single example. That prospect matters to Google, he says, because it could let its systems learn new search terms quickly.

Others have developed one-shot learning systems, but these are usually not compatible with deep-learning systems. An academic project last year used probabilistic programming techniques to enable this kind of very efficient learning (see "This Algorithm Learns Tasks As Fast As We Do").

But deep-learning systems are becoming more capable, especially with the addition of memory mechanisms. Another group at Google DeepMind recently developed a network with a flexible kind of memory, making it capable of performing simple reasoning tasks—for example, learning how to navigate a subway system after analyzing several much simpler network diagrams (see "What Happens When You Give a Computer a Working Memory?").

"I think this is a very interesting approach, providing a novel way of doing one-shot learning on such large-scale data sets," says Sang Wan Lee, who leads the Laboratory for Brain and Machine Intelligence at the Korean Advanced Institute for Science and Technology in Daejeon, South Korea. "This is a technical contribution to the AI community, which is something that computer vision researchers might fully appreciate."

Others are more skeptical about its usefulness, given how different it still is from human learning. For one thing, says Sam Gershman, an assistant professor in Harvard's Department of Psychology, humans generally learn by understanding the components that make up an image, which may require some real-world, or commonsense, knowledge. For example, "a Segway might look very different from a bicycle or motorcycle, but it can be composed from the same parts."

According to both Gershman and Lee, it will be some time yet before machines match human learning. "We still remain far from revealing humans’ secret of performing one-shot learning," Lee says, "but this proposal clearly poses new challenges that merit further study."
