A computer program that learns to “imagine” the world shows how AI can think more like us

DeepMind’s advance could lead to machines that can make better sense of a scene.
June 14, 2018
DeepMind

Machines will need to get a lot better at making sense of the world on their own if they are ever going to become truly intelligent.

DeepMind, the AI-focused subsidiary of Alphabet, has taken a step in that direction by making a computer program that builds a mental picture of the world all by itself. You might say that it learns to imagine the world around it.

The system, which uses what DeepMind’s researchers call a generative query network (GQN), looks at a scene from several angles and can then describe what it would look like from another angle.

This might seem trivial, but it requires a relatively sophisticated ability to learn about the physical world. In contrast to many AI vision systems, the DeepMind program makes sense of a scene more the way a person does. Even if something is partly occluded, for example, it can reason about what’s there.

Eventually, such technology might serve as a foundation for deeper artificial intelligence, letting machines describe and reason about the world with much greater sophistication.

Ali Eslami, a research scientist at DeepMind, and his colleagues tested the approach in three virtual settings: a tabletop scene with simple block-like objects, a virtual robot arm, and a simple maze. The system uses two neural networks: one learns a compact representation of a scene from the observed views, and the other generates, or “imagines,” what the scene would look like from a new perspective. That representation is a vector capturing aspects of the scene, including object shapes, positions, and colors, which makes the system relatively efficient. The research appears in the journal Science today.
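The rough shape of that two-network setup can be sketched in a few lines of code. The sketch below is purely illustrative and assumes PyTorch; the network sizes, layer choices, and the 7-number camera-pose encoding are invented for clarity and are much simpler than the convolutional and recurrent components in the published model. It only shows the general idea: encode several (image, viewpoint) pairs into one scene vector, then decode that vector from a new, unseen viewpoint.

```python
# Minimal GQN-style sketch (illustrative only, not DeepMind's actual architecture).
import torch
import torch.nn as nn

class RepresentationNet(nn.Module):
    """Encodes one (image, camera viewpoint) pair into a scene vector."""
    def __init__(self, repr_dim=256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # 7 = assumed camera-pose encoding (position plus orientation terms)
        self.fc = nn.Linear(64 + 7, repr_dim)

    def forward(self, image, viewpoint):
        feats = self.encoder(image)
        return self.fc(torch.cat([feats, viewpoint], dim=-1))

class GenerationNet(nn.Module):
    """'Imagines' an image of the scene from a queried viewpoint."""
    def __init__(self, repr_dim=256, image_size=32):
        super().__init__()
        self.image_size = image_size
        self.decoder = nn.Sequential(
            nn.Linear(repr_dim + 7, 512), nn.ReLU(),
            nn.Linear(512, 3 * image_size * image_size), nn.Sigmoid(),
        )

    def forward(self, scene_repr, query_viewpoint):
        x = self.decoder(torch.cat([scene_repr, query_viewpoint], dim=-1))
        return x.view(-1, 3, self.image_size, self.image_size)

# Several observed views are encoded and summed into a single scene vector,
# which the generator then decodes from a viewpoint it has never seen.
rep_net, gen_net = RepresentationNet(), GenerationNet()
images = torch.rand(3, 3, 32, 32)   # three observed views of one scene
viewpoints = torch.rand(3, 7)       # the corresponding camera poses
scene = rep_net(images, viewpoints).sum(dim=0, keepdim=True)
predicted_view = gen_net(scene, torch.rand(1, 7))  # imagined image from a new angle
```

In the real system both networks are trained end to end, so the representation network is never told what to store; it learns on its own which aspects of a scene (shape, position, color) are worth keeping in the vector.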

The work is something of a new direction for DeepMind, which has made its name by developing programs capable of performing remarkable feats, including learning how to play the complex and abstract board game Go. The new project builds upon other academic research that seeks to mimic human perception and intelligence using similar computational tools.

“It is an interesting and valuable step in the right direction,” says Josh Tenenbaum, a professor who leads the Computational Cognitive Science group at MIT.

Tenenbaum says the ability to deal with complex scenes in a modular way is impressive but adds that the approach shows the same limitations as other machine-learning methods, including a need for a huge amount of training data: “The jury is still out on how much of the problem this solves.”

Sam Gershman, who heads the Computational Cognitive Neuroscience Lab at Harvard, says the DeepMind work combines some important ideas about how human visual perception works. But he notes that, like other AI programs, it is somewhat narrow, in that it can answer only a single query: what would a scene look like from a different viewpoint?

“In contrast, humans can answer an infinite variety of queries about a scene,” Gershman says. “What would a scene look like if I moved the blue circle a bit to the left, or repainted the red triangle, or squashed the yellow cube?”

Gershman says it’s unclear whether DeepMind’s approach could be adapted to answer more complex questions or whether some fundamentally different approach might be required.
