The Animal-AI Olympics is going to treat AI like a lab rat

The $10,000 competition will test AI with challenges that were originally designed to test animal cognition—to see how close we are to machines that have common sense.

Oscar Schwartz

April 1, 2019

Ms. Tech; Images: Wikimedia Commons

In one of Aesop’s fables, a thirsty crow finds a pitcher with a small amount of water beyond the reach of its beak. After failing to push the pitcher over, the crow drops pebbles in one by one until the water level rises, allowing the bird to have a drink. For Aesop, the fable showed the superiority of intelligence over brute strength.

Two and a half millennia later, we might get to see whether AI could pass Aesop’s ancient intelligence test. In June, researchers will train algorithms to master a suite of tasks that have traditionally been used to test animal cognition. This will be the Animal-AI Olympics, with a share in a $10,000 prize pool on offer.

Usually, AI benchmarks involve mastering a single task, like beating a grandmaster in Go or figuring out how to learn a video game from scratch. AI has been extraordinarily successful in such realms. But when you apply the same AI systems to a totally different task, they are generally hopeless. That is why, in the Animal-AI Olympics, the same agent will be subjected to 100 previously unseen tasks. What is being tested is not a particular type of intelligence but the ability of a single agent to adapt to diverse environments. This would demonstrate a limited form of generalized intelligence: a type of common sense that AI will need if it is ever to succeed in our homes or in our daily lives. The competition organizers accept that none of the AI systems will be able to adapt perfectly to every circumstance or post a perfect score. But they hope that the best systems will be able to adapt to tackle the different problems they face.

The Animal-AI Olympics is the creation of a team of researchers at the Leverhulme Centre for the Future of Intelligence in Cambridge, England, along with GoodAI, a Prague-based research institute. The competition is part of a bigger project at the Leverhulme Centre called Kinds of Intelligence, which brings together an interdisciplinary team of animal cognition researchers, computer scientists, and philosophers to consider the differences and similarities between human, animal, and mechanical ways of thinking. And while most of the tasks are typically used as intelligence tests for animals, the competition will also tiptoe into human territory: some of the challenges are used to test cognition in babies and young children. The group hopes to include more human cognitive tasks in future, more complex versions of the challenge.

Rather than asking researchers to build physical robots, Marta Halina, the group’s director, and her team developed a virtual environment created with the video-game development software Unity. The setup simulates a lab testing environment for animal cognition, complete with food rewards, walls, and movable objects. Later this month, this simulated “playground,” as Halina calls it, will be released to the AI community, and researchers will be invited to train agents that can navigate it.

The agents will be computer systems that can act autonomously in this environment, much like the AI bots that OpenAI and DeepMind have developed to compete in games like Dota and StarCraft. The competition organizers welcome any type of approach to building these agents and expect that many will opt for reinforcement learning. But they are also hoping that researchers will experiment with new methods, particularly what they call the “cognitive approach,” such as that championed by researchers like Josh Tenenbaum at MIT, which involves simulating human (or, in this case, animal) problem solving and mental processing in a computerized model.

In June, researchers will submit their agents, and the team at Cambridge will run them through 100 tests separated into 10 categories. Matthew Crosby, a postdoctoral researcher at the Leverhulme Centre, says that at this stage the tests are being kept secret, so that participants can’t teach the agents specific skills before the competition starts.

The tests will range in difficulty. Some might be as basic as requiring the agent to retrieve food from an environment with no obstacles. Harder tasks will require an understanding of object permanence—knowing that an object is still there even if it is hidden—and the capacity to make a mental model of an environment in order to navigate it in the dark.

According to Crosby, the most challenging aspect of the competition is that the agents will have to be good at all the tests across the board: the winning agent will be the one that shows good performance on average, rather than just an ability to master hard tasks. What is being tested is the capacity to adapt quickly to new situations or translate skills from one type of activity to another, which is a good indicator of general intelligence. For Crosby, this type of flexibility is essential to making AI useful in the real world.
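To make the "good performance on average" criterion concrete, here is a minimal Python sketch. It is an illustration only, not the organizers' scoring code: the category names, the pass rates, and the helpers `average_score` and `pick_winner` are all hypothetical, and the real competition may weight or aggregate results differently.

```python
from statistics import mean

# Hypothetical pass rates (0.0 to 1.0) per test category. The category names
# and numbers are invented for illustration, not taken from the competition.
agents = {
    # Masters the hard categories but fails the basics.
    "specialist": {
        "food_retrieval": 0.2,
        "obstacles": 0.1,
        "object_permanence": 1.0,
        "dark_navigation": 1.0,
    },
    # Decent everywhere, outstanding nowhere.
    "generalist": {
        "food_retrieval": 0.7,
        "obstacles": 0.7,
        "object_permanence": 0.6,
        "dark_navigation": 0.6,
    },
}


def average_score(scores: dict) -> float:
    """Unweighted mean across categories (an assumed aggregation rule)."""
    return mean(scores.values())


def pick_winner(all_agents: dict) -> str:
    """Return the name of the agent with the highest average score."""
    return max(all_agents, key=lambda name: average_score(all_agents[name]))


if __name__ == "__main__":
    for name, scores in agents.items():
        print(f"{name}: {average_score(scores):.3f}")
    print("winner:", pick_winner(agents))
```

Under these made-up numbers the generalist comes out ahead, even though the specialist posts perfect scores on the two hardest categories. That is the trade-off Crosby describes.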

The Animal-AI Olympics is not the first AI research project to take inspiration from animal intelligence. Radhika Nagpal, a professor of computer science at Harvard, investigates what AI might gain from studying the emergent intelligence displayed by schools of fish and flocks of birds. And last year Kiana Ehsani led a team of researchers from the University of Washington and the Allen Institute for AI in training neural networks to, in a very limited range of tasks, think like a dog. Ehsani says she would be interested in participating in the Animal-AI Olympics and sees its goals as aligned with her own.

While these projects have achieved some success, replicating animal intelligence in computational agents is still considered a hard problem. As the pioneering AI researcher Judea Pearl has said, animals’ cognitive skills (the navigation proficiency of cats, a dog’s uncanny sense of smell, the razor-sharp vision of snakes) all vastly outperform anything that can be made in a laboratory. This biological intelligence is the result of hundreds of millions of years of evolution.

“I believe that to have AI perform as intelligently as an animal requires building some of that innate structure into the system,” says Anthony Zador, a professor of neuroscience at Cold Spring Harbor Laboratory. “How you do that is a difficult question that no one has an answer to yet.”

Another complicating factor is that metrics for animal intelligence are themselves contested. In his book Are We Smart Enough to Know How Smart Animals Are? Frans de Waal, director of the Yerkes National Primate Research Center at Emory University, argues that many tests judge mental fitness in animals only by virtue of how similar they are to humans. So instead of testing the limits of their natural behaviors, we train animals to do human-like tasks.

This is partly because accredited scientific experiments in animal cognition have to take place in the lab, far away from an animal’s natural environment. The Animal-AI Olympics adds yet another layer of abstraction from the real world by simulating lab environments on the computer, eliminating not only the natural environment but the embodied experience of animal life.

Crosby acknowledges that there are limitations in using tests from animal intelligence to benchmark AI capability. But he says the project is more about exploring the differences between minds than trying to prove equivalence between artificial and biological cognition. Indeed, he hopes it sheds light on how our own brains work, as well as testing the best in AI.

“What we are actually interested in is discovering how to translate between different types of intelligence,” he says. “If part of what we learn is where this translation fails, that’s a success as far as we’re concerned.”
