DeepMind wants to teach AI to play a card game that’s harder than Go

If you’ve ever played the card game Hanabi, you’ll understand when I say it’s unlike any other. It’s a collaborative game in which you have full view of everyone else’s hands but not your own.
To win the game, players must give one another hints about the cards they are holding, using a limited supply of hints, so that every color's cards get played in ascending order. It's an intense exercise in strategy, inference, and cooperation. That's why researchers at Google Brain and DeepMind think it's the perfect game for AI to tackle next.
In a new paper, they argue that unlike the other games AI has mastered, such as chess, Go, and poker, Hanabi requires theory of mind and a higher level of reasoning. Theory of mind is about understanding the mental states of others—and understanding that they may not be the same as your own. It’s a foundational skill that humans use to operate efficiently in the world, and one that we usually pick up when we are very young.
Information in Hanabi is limited both by the number of hints afforded to the players in each game and by what can be communicated in each hint. As a result, an AI agent must also pick up implicit information from the other players’ actions to win the game—a challenge it hasn’t had to face before.
Additionally, it has to learn how to provide the maximum possible information in its own hints and actions to help the other players succeed. If an AI agent can successfully navigate such an imperfect-information environment, the researchers believe, it will be one step closer to cooperating effectively with humans.
These are all novel challenges for the research community, and meeting them will require algorithmic advances that link together several subfields of AI, including reinforcement learning, game theory, and emergent communication, the study of how communication arises between multiple AI agents in collaborative settings.
To test this hypothesis, the team ran current state-of-the-art reinforcement-learning algorithms on Hanabi and found that they perform poorly. In response, they released an open-source Hanabi environment to spur further work within the research community.
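For readers who want to experiment, the environment is published on GitHub as the Hanabi Learning Environment. Below is a minimal sketch of a random-play loop against its Python interface; the package name, the `rl_env.make` call, and the observation keys follow the repository's published examples, though exact details may vary between versions.

```python
# A minimal sketch of a random-play loop for the open-source Hanabi
# environment (github.com/deepmind/hanabi-learning-environment), based on
# the repository's published Python interface; treat it as illustrative.
import random

from hanabi_learning_environment import rl_env

# Standard two-player game: full deck, five colors, the usual hint and
# life tokens.
env = rl_env.make(environment_name="Hanabi-Full", num_players=2)

observations = env.reset()
done = False
episode_reward = 0

while not done:
    # Each player sees only a filtered view of the state: everyone
    # else's cards and all hints given so far, but never their own hand.
    current_player = observations["current_player"]
    observation = observations["player_observations"][current_player]

    # Choose uniformly among the legal moves (play a card, discard a
    # card, or spend a hint to reveal a color or rank to another player).
    action = random.choice(observation["legal_moves"])
    observations, reward, done, _ = env.step(action)
    episode_reward += reward

# The running reward total tracks the fireworks score the players built.
print("Episode score:", episode_reward)
```

Random play like this scores close to zero; the gap between that baseline and a perfect 25-point game is what learning agents, and eventually human-AI teams, are meant to close.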
“As a researcher I have been fascinated by how AI agents can learn to communicate and cooperate with each other and ultimately also humans,” says Jakob Foerster, one of the paper’s coauthors. “Hanabi presents a unique opportunity for a grand challenge in this area.”