Lossy stochastic game abstraction with bounds

Abstraction followed by equilibrium finding has emerged as the leading approach to solving games. Lossless abstraction typically yields games that are still too large to solve, so lossy abstraction is needed. Unfortunately, prior lossy game abstraction algorithms have no guarantees on solution quality. We develop a framework that enables the design of lossy game abstraction algorithms with guarantees on solution quality. The framework handles state and action abstraction simultaneously. We define a measure of the reward approximation error and the transition probability error introduced by state and action abstraction in stochastic games, such that the regret of the equilibrium found in the abstract game, when implemented in the original, unabstracted game, is upper-bounded by a function of those measures. We then develop the first lossy game abstraction algorithms with bounds on solution quality. Both work level by level, starting from the end of the game and moving up. One of the algorithms is greedy and the other is an integer linear program. We also prove that the abstraction problem is NP-complete (even with just action abstraction, two agents, and a one-step game), but point out that this does not mean that the game abstraction problems that occur in practice cannot be solved quickly.
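To make the notion of reward approximation error under action abstraction concrete, the following is a minimal sketch for a one-step game, not the greedy algorithm from the paper. It assumes a payoff matrix for a single agent (rows are that agent's actions, columns are the opponent's actions) and a hypothetical tolerance `eps`; the function names are illustrative only. Each action is merged into the first earlier representative whose payoffs differ by at most `eps` in every column, and the largest gap actually incurred is reported.

```python
from typing import List, Tuple


def reward_error(a: List[float], b: List[float]) -> float:
    """Largest payoff gap between two actions over all opponent actions."""
    return max(abs(x - y) for x, y in zip(a, b))


def greedy_action_abstraction(
    payoffs: List[List[float]], eps: float
) -> Tuple[List[int], float]:
    """Greedily merge row actions whose payoffs are within eps of a representative.

    Illustrative assumption: this is a simple first-fit heuristic, not the
    paper's algorithm. Returns (mapping, worst_error), where mapping[i] is the
    index of the representative row that action i is merged into and
    worst_error is the largest reward error introduced by any merge.
    """
    representatives: List[int] = []
    mapping: List[int] = []
    worst_error = 0.0
    for i, row in enumerate(payoffs):
        for r in representatives:
            err = reward_error(row, payoffs[r])
            if err <= eps:
                # Merge action i into representative r and record the error.
                mapping.append(r)
                worst_error = max(worst_error, err)
                break
        else:
            # No representative is close enough, so action i stays distinct.
            representatives.append(i)
            mapping.append(i)
    return mapping, worst_error


if __name__ == "__main__":
    # Hypothetical 3-action game: the first two actions are nearly identical.
    example = [[1.0, 0.0], [1.05, 0.0], [0.0, 1.0]]
    mapping, err = greedy_action_abstraction(example, eps=0.1)
    print(mapping, err)
```

In this sketch the reported `worst_error` plays the role of the reward approximation error; a regret bound of the kind described above would be a function of such a quantity together with the transition probability error, which a one-step game does not have.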
