    A computer program that learns to “imagine” the world shows how AI can think more like us

    DeepMind’s advance could lead to machines that can make better sense of a scene.

    Machines will need to get a lot better at making sense of the world on their own if they are ever going to become truly intelligent.

    DeepMind, the AI-focused subsidiary of Alphabet, has taken a step in that direction by making a computer program that builds a mental picture of the world all by itself. You might say that it learns to imagine the world around it.

    The system, which uses what DeepMind’s researchers call a generative query network (GQN), looks at a scene from several angles and can then describe what it would look like from another angle.

    This might seem trivial, but it requires a relatively sophisticated ability to learn about the physical world. In contrast to many AI vision systems, the DeepMind program makes sense of a scene more the way a person does. Even if something is partly occluded, for example, it can reason about what’s there.

    Eventually, such technology might help serve as the foundation for deeper artificial intelligence, letting machines describe and reason about the world with much greater sophistication.

    Ali Eslami, a research scientist at DeepMind, and his colleagues tested the approach on three virtual settings: a block-like tabletop, a virtual robot arm, and a simple maze. The system uses two neural networks: one learns a representation of a scene from the views it is shown, and the other generates, or “imagines,” new perspectives. The system captures aspects of a scene, including object shapes, positions, and colors, using a vector representation, which makes it relatively efficient. The research appears in the journal Science today.
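
    To make the two-network idea concrete, here is a toy sketch in Python. It is not DeepMind's code: the function names, the vector sizes, the random untrained weights, and the choice to combine views by summing their encodings are all assumptions made for illustration. It only shows the data flow described above, in which several views are compressed into one scene vector and that vector is then queried with a new viewpoint.

```python
import math
import random

random.seed(0)

# Hypothetical sizes, chosen only for illustration.
OBS_DIM, VIEW_DIM, REP_DIM = 8, 3, 4


def random_matrix(rows, cols):
    """Untrained placeholder weights; a real system would learn these."""
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


W_REP = random_matrix(REP_DIM, OBS_DIM + VIEW_DIM)  # "representation" network
W_GEN = random_matrix(OBS_DIM, REP_DIM + VIEW_DIM)  # "generation" network


def represent(observations):
    """Turn a list of (image, viewpoint) pairs into one scene vector.

    Assumption: per-view encodings are summed, so the order of views
    does not matter.
    """
    scene = [0.0] * REP_DIM
    for image, viewpoint in observations:
        encoded = [math.tanh(x) for x in matvec(W_REP, image + viewpoint)]
        scene = [s + e for s, e in zip(scene, encoded)]
    return scene


def generate(scene, query_viewpoint):
    """Predict what the scene looks like from a viewpoint not yet seen."""
    return matvec(W_GEN, scene + query_viewpoint)


# Two made-up views of a scene, each paired with a made-up camera position.
views = [
    ([0.1] * OBS_DIM, [0.0, 0.0, 1.0]),
    ([0.2] * OBS_DIM, [1.0, 0.0, 0.0]),
]
scene = represent(views)
predicted = generate(scene, [0.0, 1.0, 0.0])
print(len(scene), len(predicted))  # prints: 4 8
```

    Because the weights here are random, the predicted "image" is meaningless. The sketch is meant only to show why a compact vector can be an efficient way to hold a scene: every later query reuses the same few numbers.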

    The work is something of a new direction for DeepMind, which has made its name by developing programs capable of performing remarkable feats, including learning how to play the complex and abstract board game Go. The new project builds upon other academic research that seeks to mimic human perception and intelligence using similar computational tools.

    “It is an interesting and valuable step in the right direction,” says Josh Tenenbaum, a professor who leads the Computational Cognitive Science group at MIT.

    Tenenbaum says the ability to deal with complex scenes in a modular way is impressive but adds that the approach shows the same limitations as other machine-learning methods, including a need for a huge amount of training data: “The jury is still out on how much of the problem this solves.”

    Sam Gershman, who heads the Computational Cognitive Neuroscience Lab at Harvard, says the DeepMind work combines some important ideas about how human visual perception works. But he notes that, like other AI programs, it is somewhat narrow, in that it can answer only a single query: what would a scene look like from a different viewpoint?

    “In contrast, humans can answer an infinite variety of queries about a scene,” Gershman says. “What would a scene look like if I moved the blue circle a bit to the left, or repainted the red triangle, or squashed the yellow cube?”

    Gershman says it’s unclear whether DeepMind’s approach could be adapted to answer more complex questions or whether some fundamentally different approach might be required.
