Google DeepMind’s new AI system can solve complex geometry problems

Its performance matches the smartest high school mathematicians and is much stronger than the previous state-of-the-art system.

June Kim

January 17, 2024


Google DeepMind has created an AI system that can solve complex geometry problems. It’s a significant step toward machines with more human-like reasoning skills, experts say.

Geometry, and mathematics more broadly, have challenged AI researchers for some time. Compared with text-based AI models, there is significantly less training data for mathematics because it is symbol driven and domain specific, says Thang Luong, a coauthor of the research, which is published in Nature today.

Solving mathematics problems requires logical reasoning, something that most current AI models aren’t great at. This demand for reasoning is why mathematics serves as an important benchmark for gauging progress in AI, says Luong.

DeepMind’s program, named AlphaGeometry, combines a language model with a type of AI called a symbolic engine, which uses symbols and logical rules to make deductions. Language models excel at recognizing patterns and predicting subsequent steps in a process. However, their reasoning lacks the rigor required for mathematical problem-solving. The symbolic engine, on the other hand, is based purely on formal logic and strict rules, which allows it to guide the language model toward rational decisions.

These two approaches, responsible for creative thinking and logical reasoning respectively, work together to solve difficult mathematical problems. This closely mimics how humans work through geometry problems, combining their existing understanding with explorative experimentation.

DeepMind says it tested AlphaGeometry on 30 geometry problems at the same level of difficulty found at the International Mathematical Olympiad, a competition for top high school mathematics students. It completed 25 within the time limit. The previous state-of-the-art system, developed by the Chinese mathematician Wen-Tsün Wu in 1978, completed only 10.

“This is a really impressive result,” says Floris van Doorn, a mathematics professor at the University of Bonn, who was not involved in the research. “I expected this to still be multiple years away.”

DeepMind says this system demonstrates AI’s ability to reason and discover new mathematical knowledge.

“This is another example that reinforces how AI can help us advance science and better understand the underlying processes that determine how the world works,” said Quoc V. Le, a scientist at Google DeepMind and one of the authors of the research, at a press conference.

When presented with a geometry problem, AlphaGeometry first attempts to generate a proof using its symbolic engine, driven by logic. If it cannot do so using the symbolic engine alone, the language model adds a new point or line to the diagram. This opens up additional possibilities for the symbolic engine to continue searching for a proof. This cycle continues, with the language model adding helpful elements and the symbolic engine testing new proof strategies, until a verifiable solution is found.
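The alternation described above can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not DeepMind's implementation: facts are plain strings, the "symbolic engine" is a simple forward-chaining loop over hand-written rules, and the language model is replaced by a hypothetical scripted proposer that hands back canned suggestions. The names `deduce`, `solve`, `scripted_proposer`, and `RULES` are invented for this example.

```python
from typing import Callable, FrozenSet, Iterable, List, Optional, Set, Tuple

Rule = Tuple[FrozenSet[str], str]  # (premises, conclusion)
Proposer = Callable[[Set[str], str], Optional[str]]


def deduce(facts: Set[str], rules: Iterable[Rule]) -> Set[str]:
    """Toy stand-in for a symbolic engine: apply rules until nothing new follows."""
    known = set(facts)
    rule_list = list(rules)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rule_list:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


def solve(premises: Set[str], goal: str, rules: Iterable[Rule],
          propose: Proposer, max_constructions: int = 3) -> Optional[List[str]]:
    """Alternate deduction with proposed constructions.

    Returns the list of constructions that were needed, or None on failure.
    """
    rule_list = list(rules)
    facts = set(premises)
    added: List[str] = []
    for step in range(max_constructions + 1):
        facts = deduce(facts, rule_list)
        if goal in facts:
            return added
        if step == max_constructions:
            break  # construction budget exhausted
        construction = propose(facts, goal)
        if construction is None or construction in facts:
            return None  # the proposer has nothing new to offer
        facts.add(construction)
        added.append(construction)
    return None


def scripted_proposer(suggestions: List[str]) -> Proposer:
    """Hypothetical stand-in for the language model: replays canned suggestions."""
    remaining = list(suggestions)

    def propose(facts: Set[str], goal: str) -> Optional[str]:
        return remaining.pop(0) if remaining else None

    return propose


# Toy rules for an isosceles triangle ABC with AB = AC (illustrative only).
RULES: List[Rule] = [
    (frozenset({"AB = AC", "M is the midpoint of BC"}),
     "triangle ABM is congruent to triangle ACM"),
    (frozenset({"triangle ABM is congruent to triangle ACM"}),
     "angle ABC = angle ACB"),
]

result = solve({"AB = AC"}, "angle ABC = angle ACB", RULES,
               scripted_proposer(["M is the midpoint of BC"]))
print(result)  # ['M is the midpoint of BC']
```

In this toy case the deduction step alone cannot reach the goal from `AB = AC`. Once the proposer adds the midpoint, the same rules are enough to finish the proof, which mirrors the division of labor the researchers describe.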

To train AlphaGeometry's language model, the researchers had to create their own training data to compensate for the scarcity of existing geometric data. They generated nearly half a billion random geometric diagrams and fed them to the symbolic engine. This engine analyzed each diagram and produced statements about its properties. These statements were organized into 100 million synthetic proofs to train the language model.
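One way to picture this data-generation step is the rough Python sketch below. It is an assumption-heavy illustration rather than the researchers' pipeline: random "diagrams" are reduced to random sets of starting facts, the symbolic engine is a toy forward-chaining loop, and each derived statement is traced back to the starting facts and steps it depends on to form one training example. All names (`deduce_with_trace`, `trace_back`, `synthesize`) are hypothetical, and the sketch assumes every rule has at least one premise.

```python
import random
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Rule = Tuple[FrozenSet[str], str]  # (premises, conclusion)
Step = Tuple[List[str], str]       # (facts used, fact derived)


def deduce_with_trace(premises: Set[str], rules: Iterable[Rule]) -> Dict[str, FrozenSet[str]]:
    """Forward-chain and record which facts each derived fact came from.

    Starting facts map to an empty frozenset.
    """
    parents: Dict[str, FrozenSet[str]] = {p: frozenset() for p in premises}
    rule_list = list(rules)
    changed = True
    while changed:
        changed = False
        for needs, conclusion in rule_list:
            if conclusion not in parents and all(n in parents for n in needs):
                parents[conclusion] = needs
                changed = True
    return parents


def trace_back(fact: str, parents: Dict[str, FrozenSet[str]]) -> List[Step]:
    """Return the derivation steps that lead to `fact`, in dependency order."""
    steps: List[Step] = []
    seen: Set[str] = set()

    def visit(f: str) -> None:
        if f in seen:
            return
        seen.add(f)
        for p in sorted(parents[f]):
            visit(p)
        if parents[f]:  # only derived facts contribute a step
            steps.append((sorted(parents[f]), f))

    visit(fact)
    return steps


def synthesize(pool: Set[str], rules: Iterable[Rule], n_samples: int,
               premises_per_sample: int, seed: int = 0) -> List[dict]:
    """Sample random premise sets and turn each derived fact into one example."""
    rng = random.Random(seed)
    rule_list = list(rules)
    examples: List[dict] = []
    for _ in range(n_samples):
        premises = set(rng.sample(sorted(pool), premises_per_sample))
        parents = deduce_with_trace(premises, rule_list)
        for fact, needs in parents.items():
            if not needs:
                continue  # skip the starting facts themselves
            steps = trace_back(fact, parents)
            # Keep only the starting facts the proof actually relies on.
            used = sorted({p for step_needs, _ in steps
                           for p in step_needs if not parents[p]})
            examples.append({"premises": used, "conclusion": fact, "proof": steps})
    return examples
```

The trace-back step is what keeps each example compact: starting facts that play no part in reaching a conclusion are left out of that example.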

Roman Yampolskiy, an associate professor of computer science and engineering at the University of Louisville who was not involved in the research, says that AlphaGeometry’s ability shows a significant advancement toward more “sophisticated, human-like problem-solving skills in machines.”

“Beyond mathematics, its implications span across fields that rely on geometric problem-solving, such as computer vision, architecture, and even theoretical physics,” said Yampolskiy in an email.

However, there is room for improvement. While AlphaGeometry can solve problems found in “elementary” mathematics, it remains unable to grapple with the sorts of advanced, abstract problems taught at university.

“Mathematicians would be really interested if AI can solve problems that are posed in research mathematics, perhaps by having new mathematical insights,” said van Doorn.

Luong says the goal is to apply a similar approach to broader math fields. “Geometry is just an example for us to demonstrate that we are on the verge of AI being able to do deep reasoning,” he says.

Correction: This story was updated to correct one of the study author's last names.
