AI Agents Learn to Work Together by Wrangling Virtual Swine

Collaboration and cooperation are crucial elements of human intelligence. Now some algorithms are learning how to work together.

Will Knight

June 13, 2017

Wrangling a pig—even a virtual one—is much easier if you get a friend to help. This much seems clear from a contest organized by Microsoft researchers to test how artificially intelligent agents could cooperate to solve tricky problems. How best to cooperate with your pig-wrangling pal is another question.

The competition addresses an area of artificial intelligence that has had relatively little attention so far. AI researchers often develop software capable of performing a specific human task, such as playing chess or Go, and then measure it according to its ability to defeat a human player. However, a great deal of human intelligence involves communication, social intelligence, and theory of mind, or the ability to anticipate and interpret another intelligent agent’s intentions.

The project also hints at how humans and AI systems might eventually work together to achieve more than the sum of their parts. “This is part of a broader trend of rethinking AI as augmented intelligence rather than artificial intelligence,” says Oren Etzioni, CEO of the Allen Institute for Artificial Intelligence.

For the Microsoft contest, AI agents worked together inside Project Malmo, a special version of the open-ended computer game Minecraft. Microsoft’s researchers designed this environment to make it straightforward to import and test different AI techniques. Much further progress will be needed before AI agents can team up in useful ways or assist humans, but the contest offers a way to test some early ideas.

For the competition, agents could try to control and catch an unruly virtual pig either on their own or by teaming up with another AI agent, earning points each time.

The top teams in the Malmo Collaborative AI Challenge used cutting-edge machine-learning approaches such as deep learning to train their agents to work together. This entailed feeding them large amounts of data. But some participants also made use of older, less fashionable approaches that involve giving a virtual agent hard-coded knowledge and understanding.

The winners of the contest, a team from the University of Oxford in the U.K., used reinforcement learning, a kind of machine learning inspired by the way animals learn through experimentation (see “10 Breakthrough Technologies: Reinforcement Learning”). Their agents experienced positive reinforcement whenever they successfully worked together to grab the pig.
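To give a rough sense of what learning from positive reinforcement means, here is a minimal sketch in Python. It is not the Oxford team's method, which the article does not describe in detail. It is a stripped-down illustration in which a single agent repeatedly chooses between two hypothetical actions, "cooperate" and "solo", and nudges its estimate of each action's value toward the reward it receives. The action names, reward values, and parameters are all assumptions made for the example.

```python
import random


def train(episodes=2000, epsilon=0.1, learning_rate=0.1, seed=0):
    """Learn action values from rewards with an epsilon-greedy rule."""
    rng = random.Random(seed)

    # The agent's current estimate of how good each action is.
    values = {"cooperate": 0.0, "solo": 0.0}

    # Hypothetical rewards: teaming up pays more than acting alone.
    rewards = {"cooperate": 1.0, "solo": 0.2}

    for _ in range(episodes):
        if rng.random() < epsilon:
            # Occasionally experiment with a random action.
            action = rng.choice(list(values))
        else:
            # Otherwise pick the action that currently looks best.
            action = max(values, key=values.get)

        reward = rewards[action]
        # Move the estimate a small step toward the observed reward.
        values[action] += learning_rate * (reward - values[action])

    return values


if __name__ == "__main__":
    print(train())
```

Under these assumed rewards, the agent ends up valuing cooperation more highly simply because cooperating has paid off more often. Real systems such as the contest entries must also cope with changing game states and an unpredictable partner, which this sketch leaves out.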

Katja Hofmann, the lead researcher on Microsoft’s Malmo project, notes that many teams combined different approaches. “There was no single type of approach that emerged as a clear winner,” she adds, saying it’s likely that hybrid approaches “will prove particularly promising directions for future research.”

The pig-wrangling challenge takes inspiration from a thought experiment known as the Stag Hunt, which explores concepts within game theory, a branch of mathematics concerned with cooperation and negotiation strategies. The idea is that two hunters must decide whether to hunt a hare on their own or team up to snag the bigger prize of a stag.
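The dilemma can be made concrete with a small payoff table. The Python sketch below uses illustrative numbers that are assumptions, not figures from the contest: hunting the stag together is worth 4 to each hunter, a hare is worth 3 to whoever hunts it, and a hunter who goes after the stag alone gets nothing. The code checks which pairs of choices are stable, meaning neither hunter could do better by switching on their own.

```python
from itertools import product

ACTIONS = ("stag", "hare")

# Hypothetical payoffs as (first hunter, second hunter); values are illustrative.
PAYOFFS = {
    ("stag", "stag"): (4, 4),  # both cooperate and share the big prize
    ("stag", "hare"): (0, 3),  # lone stag hunter fails; hare hunter succeeds
    ("hare", "stag"): (3, 0),
    ("hare", "hare"): (3, 3),  # both play it safe
}


def is_stable(first, second):
    """Return True if neither hunter gains by changing only their own choice."""
    first_payoff, second_payoff = PAYOFFS[(first, second)]
    first_content = all(
        PAYOFFS[(alt, second)][0] <= first_payoff for alt in ACTIONS
    )
    second_content = all(
        PAYOFFS[(first, alt)][1] <= second_payoff for alt in ACTIONS
    )
    return first_content and second_content


def stable_outcomes():
    """List every pair of choices that is stable in the sense above."""
    return [pair for pair in product(ACTIONS, ACTIONS) if is_stable(*pair)]


if __name__ == "__main__":
    print(stable_outcomes())
```

With these numbers there are two stable outcomes: both hunters chase the stag, or both settle for a hare. Cooperation pays more, but only if each hunter trusts the other to show up, which is the kind of judgment the contest asks AI agents to make.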

The top teams involved in the contest, judged according to the score they achieved as well as the novelty of their work, will receive a $20,000 research grant and a place at Microsoft’s Research AI Summer School.

Pedro Domingos, a professor at the University of Washington who studies machine learning and data mining, says training AI software inside simulated environments has its drawbacks. Software can become overoptimized for that particular environment and therefore less useful in the real world, he says, although more sophisticated simulated worlds are starting to change this.

Domingos adds that cooperation between humans is so complex and subtle that it is hard to imagine the Microsoft project producing genuinely useful approaches. However, despite some skepticism, he is encouraged by the project.

“It’s still early days in this area, and Minecraft is an environment with a lot of possibilities,” Domingos says. “[It’s] richer than things that have been used before, so it certainly seems worth trying.”
