Approximating maxmin strategies in imperfect recall games using A-loss recall property

Abstract: Extensive-form games with imperfect recall are an important model of dynamic games where the players are allowed to forget previously known information. Often, imperfect recall games result from an abstraction algorithm that simplifies a large game with perfect recall. Solving imperfect recall games is known to be a hard problem, and thus it is useful to search for a subclass of imperfect recall games which offers sufficient memory savings while being efficiently solvable. The abstraction process can then be guided to result in a game from this class. We focus on a subclass of imperfect recall games called A-loss recall games. First, we provide a complete picture of the complexity of solving imperfect recall and A-loss recall games. We show that the A-loss recall property allows us to compute a best response in polynomial time (computing a best response is NP-hard in imperfect recall games). This allows us to create a practical algorithm for approximating maxmin strategies in two-player games where the maximizing player has imperfect recall and the minimizing player has A-loss recall. This algorithm is capable of solving some games with up to 5 · 10^9 states in approximately 1 hour. Finally, we demonstrate that the use of imperfect recall abstraction can reduce the size of the strategy representation to as low as 0.03% of the size of the strategy representation in the original perfect recall game without sacrificing the quality of the maxmin strategy obtained by solving this abstraction.
