Endgame Solving in Large Imperfect-Information Games

The leading approach for computing strong game-theoretic strategies in large imperfect-information games is to first solve an abstracted version of the game offline and then perform a table lookup during gameplay. We consider a modification to this approach in which we solve, in real time, the portion of the game that we have actually reached, to a greater degree of accuracy than in the initial computation. We call this approach endgame solving. Theoretically, we show that endgame solving can produce highly exploitable strategies in some games; however, we also show that it can guarantee low exploitability in certain games where the opponent is given sufficient exploitative power within the endgame. Furthermore, despite the lack of a general worst-case guarantee, we describe many benefits of endgame solving. We present an efficient algorithm for performing endgame solving in large imperfect-information games, as well as a new variance-reduction technique for evaluating the performance of an agent that uses endgame solving. Experiments on no-limit Texas Hold'em show that our algorithm leads to significantly stronger performance against the strongest agents from the 2013 AAAI Annual Computer Poker Competition.
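To make the notion of exploitability used above concrete, the following is a minimal sketch for a two-player zero-sum matrix game, not the paper's algorithm. It assumes the game value is known (0 for rock-paper-scissors) and measures how much a fixed row strategy loses, relative to that value, against a best-responding column player. The names `best_response_value`, `exploitability`, and `RPS` are hypothetical and chosen only for illustration.

```python
def best_response_value(payoff, row_strategy):
    """Expected payoff to the row player when the column player
    picks the pure column that is worst for the row player."""
    n_cols = len(payoff[0])
    column_values = [
        sum(p * payoff[i][j] for i, p in enumerate(row_strategy))
        for j in range(n_cols)
    ]
    return min(column_values)


def exploitability(payoff, row_strategy, game_value):
    """Gap between the game value and what the strategy secures
    against a best response (0 means unexploitable)."""
    return game_value - best_response_value(payoff, row_strategy)


# Illustrative payoff matrix (row player's payoff) for rock-paper-scissors.
RPS = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]

uniform = [1 / 3, 1 / 3, 1 / 3]   # the equilibrium strategy
always_rock = [1.0, 0.0, 0.0]     # a highly exploitable strategy

print(exploitability(RPS, uniform, 0.0))
print(exploitability(RPS, always_rock, 0.0))
```

In this toy setting the uniform strategy cannot be exploited, while always playing rock loses a full unit per game to an opponent who always plays paper. In large games the best response cannot be enumerated this directly, which is part of why worst-case guarantees for endgame solving matter.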
