Artificial intelligence

A hybrid AI model lets it reason about the world’s physics like a child

March 6, 2020
A red rubber ball hits a blue rubber cylinder that continues on to hit a metal cylinder. Courtesy of MIT-IBM Watson AI Lab

A new data set reveals just how bad AI is at reasoning—and suggests that a new hybrid approach might be the best way forward.

Questions, questions: Known as CLEVRER, the data set consists of 20,000 short synthetic video clips and more than 300,000 question-and-answer pairs about the events in the videos. Each video shows a simple world of toy objects that collide with one another following simulated physics. In one, a red rubber ball hits a blue rubber cylinder, which continues on to hit a metal cylinder.

The questions fall into four categories: descriptive (e.g., “What shape is the object that collides with the cyan cylinder?”), explanatory (“What is responsible for the gray cylinder’s collision with the cube?”), predictive (“Which event will happen next?”), and counterfactual (“Without the gray object, which event will not happen?”). The questions mirror many of the concepts that children learn early on as they explore their surroundings. But the latter three categories, which specifically require causal reasoning to answer, often stump deep-learning systems.

Fail: The data set, created by researchers at Harvard, DeepMind, and MIT-IBM Watson AI Lab, is meant to help evaluate how well AI systems can reason. When the researchers tested several state-of-the-art computer vision and natural-language models on the data set, they found that all of them did well on the descriptive questions but poorly on the others.

Mixing the old and the new: The team then tried a new AI system that combines deep learning with symbolic logic. Symbolic systems used to be all the rage before they were eclipsed by machine learning in the late 1980s. But both approaches have their strengths: deep learning excels at scalability and pattern recognition; symbolic systems are better at abstraction and reasoning.

The composite system, known as a neuro-symbolic model, leverages both: it uses a neural network to recognize the colors, shapes, and materials of the objects and a symbolic system to understand the physics of their movements and the causal relationships between them. It outperformed existing models across all categories of questions.
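To make the division of labor concrete, here is a minimal sketch of that split, with entirely hypothetical function names and hard-coded values standing in for the model's components: `perceive` plays the role of the neural network (which would actually extract attributes from video frames), and the event trace plus query function play the role of the symbolic reasoner.

```python
def perceive(frame):
    # Stand-in for the neural network: in the real model, object
    # attributes are extracted from video; here they are hard-coded
    # for one CLEVRER-style scene.
    return [
        {"id": 0, "shape": "sphere",   "color": "red",  "material": "rubber"},
        {"id": 1, "shape": "cylinder", "color": "blue", "material": "rubber"},
        {"id": 2, "shape": "cylinder", "color": "gray", "material": "metal"},
    ]

def simulate_collisions(objects):
    # Stand-in for the symbolic physics/causality component: an
    # ordered trace of events (the ball hits the blue cylinder,
    # which then hits the metal cylinder).
    return [("collision", 0, 1), ("collision", 1, 2)]

def what_shape_collides_with(objects, events, target_id):
    # A descriptive query answered symbolically, by scanning the
    # event trace rather than by pattern-matching pixels.
    for kind, a, b in events:
        if kind == "collision" and target_id in (a, b):
            other = b if a == target_id else a
            return objects[other]["shape"]
    return None

objects = perceive(frame=None)
events = simulate_collisions(objects)
print(what_shape_collides_with(objects, events, target_id=2))  # prints "cylinder"
```

The point of the split is that once perception has produced a structured scene, counterfactual and predictive questions become operations on the event trace (e.g., delete an object and re-run the simulation) rather than end-to-end pattern recognition.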

Why it matters: As children, we learn to observe the world around us, infer why things happen, and make predictions about what will happen next. These predictions help us make better decisions, navigate our environments, and stay safe. Replicating that kind of causal understanding in machines would similarly equip them to interact with the world in a more intelligent way.

