Reduced Space and Faster Convergence in Imperfect-Information Games via Pruning

Iterative algorithms such as Counterfactual Regret Minimization (CFR) are the most popular way to solve large zero-sum imperfect-information games. In this paper we introduce Best-Response Pruning (BRP), an improvement to iterative algorithms such as CFR that allows poorly performing actions to be temporarily pruned. We prove that when using CFR in zero-sum games, adding BRP will asymptotically prune any action that is not part of a best response to some Nash equilibrium. This leads to provably faster convergence and lower space requirements. Experiments show that BRP reduces space requirements by a factor of 7, and that the reduction factor increases with game size.
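
The idea of temporarily pruning a poorly performing action can be illustrated on a small matrix game. The sketch below is not the BRP algorithm of the paper: it runs plain regret matching (the per-decision update that CFR builds on) in self-play on a normal-form zero-sum game, and it uses a simpler, assumed pruning rule. A row whose cumulative regret is negative is skipped for as many iterations as it would need, at the largest possible per-iteration gain, to climb back to zero; when the row returns, the skipped regret updates are applied in one step. The payoff matrix `GAME` and all function names are hypothetical and chosen only for illustration.

```python
def regret_matching(regrets, active):
    """Play in proportion to positive regret; pruned actions get probability 0."""
    positive = [max(r, 0.0) if on else 0.0 for r, on in zip(regrets, active)]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    count = sum(active)  # no positive regret: any strategy is valid, use uniform
    return [1.0 / count if on else 0.0 for on in active]


def solve(payoff, iterations):
    """Self-play on a zero-sum matrix game; the row player maximizes payoff[i][j].

    Only row actions are pruned, to keep the sketch short.
    Returns (average row strategy, average column strategy, skipped evaluations).
    """
    m, n = len(payoff), len(payoff[0])
    flat = [v for row in payoff for v in row]
    delta = max(flat) - min(flat)            # largest regret change per iteration
    row_regret, col_regret = [0.0] * m, [0.0] * n
    row_sum, col_sum = [0.0] * m, [0.0] * n  # cumulative strategies
    value_sum = 0.0                          # cumulative expected row payoff
    active = [True] * m
    wake = [0] * m                           # iteration at which a pruned row returns
    snapshot = [None] * m                    # (col_sum, value_sum) when pruned
    skipped = 0

    for t in range(1, iterations + 1):
        # Reactivate rows whose window ended; by linearity, the regret missed
        # while pruned is recovered from the change in the cumulative sums.
        for i in range(m):
            if not active[i] and wake[i] == t:
                old_cols, old_value = snapshot[i]
                gained = sum(payoff[i][j] * (col_sum[j] - old_cols[j])
                             for j in range(n))
                row_regret[i] += gained - (value_sum - old_value)
                active[i] = True

        x = regret_matching(row_regret, active)
        y = regret_matching(col_regret, [True] * n)

        # Utilities of pruned rows are not evaluated: this is the saved work.
        row_util = [sum(payoff[i][j] * y[j] for j in range(n)) if active[i] else 0.0
                    for i in range(m)]
        skipped += active.count(False)
        value = sum(x[i] * row_util[i] for i in range(m))
        col_util = [-sum(x[i] * payoff[i][j] for i in range(m)) for j in range(n)]

        for i in range(m):
            if active[i]:
                row_regret[i] += row_util[i] - value
            row_sum[i] += x[i]
        for j in range(n):
            col_regret[j] += col_util[j] + value  # the column player's value is -value
            col_sum[j] += y[j]
        value_sum += value

        # Prune rows that cannot reach non-negative regret within `window` iterations.
        for i in range(m):
            if active[i] and delta > 0 and sum(active) > 1:
                window = int(-row_regret[i] / delta)
                if window >= 1:
                    active[i] = False
                    wake[i] = t + window + 1
                    snapshot[i] = (col_sum[:], value_sum)

    avg_row = [s / iterations for s in row_sum]
    avg_col = [s / iterations for s in col_sum]
    return avg_row, avg_col, skipped


def exploitability(payoff, x, y):
    """Sum of both players' best-response gains against the given strategies."""
    best_row = max(sum(a * q for a, q in zip(row, y)) for row in payoff)
    best_col = min(sum(x[i] * payoff[i][j] for i in range(len(x)))
                   for j in range(len(y)))
    return best_row - best_col


# Hypothetical game: biased rock-paper-scissors plus a strictly dominated fourth row.
GAME = [[0, -1, 2], [1, 0, -1], [-1, 1, 0], [-2, -2, -2]]

if __name__ == "__main__":
    avg_row, avg_col, skipped = solve(GAME, 20000)
    print("exploitability:", exploitability(GAME, avg_row, avg_col))
    print("skipped utility evaluations:", skipped)
```

In this toy setting the dominated row spends most iterations pruned, so its utility is rarely evaluated, while the average strategies still approach an equilibrium. The paper's BRP uses a different criterion, based on best-response values, and also targets space savings in extensive-form games, which this sketch does not model.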
