Could AlphaGo Bluff Its Way through Poker?

One of the brains behind Google’s Go-winning software says a similar learning approach can play Texas hold’em poker as well as a human expert.

Will Knight

March 30, 2016

One of the scientists responsible for AlphaGo, the Google DeepMind software that trounced one of the world’s best Go players recently, says the same approach can produce a surprisingly competent poker bot.

Unlike board games such as Go or chess, poker is a game of “imperfect information,” and for this reason it has proved even more resistant to computerization than Go.

Gameplay in poker involves devising a strategy based on the cards you have in your hand and a guess as to what’s in your opponents’ hands. Poker players try to read the behavior of others at the table using a combination of statistics and more subtle behavioral cues.

Because of this, building an effective poker bot using machine learning may be significant for real-world applications of AI. The game is relevant to game theory, which concerns situations involving negotiation and coöperation.

Although Go is incredibly complex and its strategic principles cannot be encoded easily, AlphaGo was at least able to see every part of the game. AlphaGo used a combination of two AI techniques, deep reinforcement learning and tree search, to come up with winning Go moves. Deep reinforcement learning involves training a large neural network with positive and negative rewards, and tree search is a mathematical strategy for looking ahead in a game.
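To make "looking ahead" concrete, here is a minimal sketch of tree search on a toy game rather than Go. It is not AlphaGo's method, which paired Monte Carlo tree search with neural networks; it simply checks every line of play exhaustively. The game (a Nim variant where players take one to three stones and whoever takes the last stone wins) and the names `can_win` and `best_move` are assumptions chosen for illustration.

```python
from functools import lru_cache

MAX_TAKE = 3  # assumption: a player may remove 1 to 3 stones per turn


@lru_cache(maxsize=None)
def can_win(stones: int) -> bool:
    """True if the player to move can force a win with `stones` left."""
    # Look ahead one move: a position is winning if at least one move
    # leaves the opponent in a position that is losing for them.
    # With 0 stones there are no moves, so the player to move has lost.
    return any(
        not can_win(stones - take)
        for take in range(1, min(MAX_TAKE, stones) + 1)
    )


def best_move(stones: int):
    """Return a winning number of stones to take, or None if every move loses."""
    for take in range(1, min(MAX_TAKE, stones) + 1):
        if not can_win(stones - take):
            return take
    return None


if __name__ == "__main__":
    for pile in (4, 5, 10):
        print(pile, "stones:", best_move(pile))
```

Exhaustive search like this works only because the toy game is tiny and fully visible. Go has far too many positions to enumerate, which is why a learned network is needed to guide the search, and poker hides part of the position altogether.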

David Silver, the lead researcher behind AlphaGo and a lecturer at University College London, posted a paper earlier this month describing efforts to build a poker bot using similar techniques.

Together with Johannes Heinrich, a research student at UCL, Silver used deep reinforcement learning to produce effective playing strategy in both Leduc, a simplified version of poker involving a deck of just six cards, and Texas hold’em, the most popular form of the game. With Leduc, the software reached a Nash equilibrium, meaning a strategy that no player can improve on by changing only their own play, which is the optimal approach as defined by game theory. In Texas hold’em, it achieved the performance of an expert human player.
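For a sense of what a Nash equilibrium looks like, consider a much smaller game than Leduc. The sketch below is an illustration only, not the researchers' poker software: it uses rock-paper-scissors and a classical procedure called fictitious play, in which each player repeatedly picks the best reply to everything the opponent has played so far. The payoff table, the starting moves, and the function names are all assumptions made for this example.

```python
# Payoff to the player choosing the row action against the column action.
# Actions are indexed 0 = rock, 1 = paper, 2 = scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def best_response(opponent_counts):
    """Index of the action that scores best against the opponent's past play."""
    values = [
        sum(PAYOFF[a][b] * opponent_counts[b] for b in range(3))
        for a in range(3)
    ]
    return values.index(max(values))  # ties go to the lowest index


def fictitious_play(rounds=50000):
    """Return each player's empirical mix of actions after self-play."""
    # Arbitrary starting history: player 0 opened with rock, player 1 with paper.
    counts = [[1, 0, 0], [0, 1, 0]]
    for _ in range(rounds):
        # Both players choose simultaneously, then the history is updated.
        a0 = best_response(counts[1])
        a1 = best_response(counts[0])
        counts[0][a0] += 1
        counts[1][a1] += 1
    return [[c / sum(player) for c in player] for player in counts]


if __name__ == "__main__":
    for mix in fictitious_play():
        print([round(p, 3) for p in mix])
```

Over many rounds each player's mix of moves should drift toward one third rock, one third paper, and one third scissors. That even mix is the equilibrium of this game: neither side can gain by deviating from it alone.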

Meanwhile, a team of researchers at the University of Oxford and Google DeepMind has turned its attention to two fantasy-inspired card games, Magic: The Gathering and Hearthstone.

These games involve playing cards representing different spells, weapons, or creatures against opponents. This work is much more preliminary, and simply involved training a neural network to interpret the information shown on each card, which may either be structured, as in a particular color or number, or unstructured, as in text describing what happens when the card is played.

Even so, Google’s AI team clearly isn’t finished with building superhuman game bots.
