Smoothing Method for Approximate Extensive-Form Perfect Equilibrium

Nash equilibrium is a popular solution concept for solving imperfect-information games in practice. However, it has a major drawback: it does not preclude suboptimal play in branches of the game tree that are not reached in equilibrium. Equilibrium refinements can mend this issue, but have seen little practical adoption, largely due to a lack of scalable algorithms. Sparse iterative methods, in particular first-order methods, are known to be among the most effective algorithms for computing Nash equilibria in large-scale two-player zero-sum extensive-form games. In this paper, we provide, to our knowledge, the first extension of these methods to equilibrium refinements. We develop a smoothing approach for behavioral perturbations of the convex polytope that encompasses the strategy spaces of players in an extensive-form game. This enables one to compute an approximate variant of extensive-form perfect equilibria. Experiments show that our smoothing approach leads to solutions with dramatically stronger strategies at information sets that are reached with low probability in approximate Nash equilibria, while retaining the overall convergence rate associated with fast algorithms for Nash equilibrium. This is beneficial both in approximate equilibrium finding, which is necessary in practice for large games and where some probabilities are low and may tend toward zero in the limit, and in exact equilibrium computation, where those probabilities are actually zero.
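The drawback described above, and the effect of a behavioral perturbation, can be illustrated with a toy example. The sketch below is not taken from the paper: the two-player zero-sum game, its payoffs, and the names `p1_payoff` and `p2_best_response` are all assumptions chosen for illustration. Player 1 picks "Out" (payoff 0) or "In"; after "In", player 2 picks "Left" (player 1 gets -1) or "Right" (player 1 gets -2). Player 1 plays "Out" in every Nash equilibrium, so player 2's information set is unreached and any mixture there is consistent with equilibrium. Forcing "In" to be played with probability at least a small epsilon makes "Right" the unique best response.

```python
# Hypothetical toy game (an assumption for illustration, not from the paper).
# Player 1: "Out" pays 0; "In" hands the move to player 2.
# Player 2 after "In": "Left" gives player 1 -1, "Right" gives player 1 -2.
# The game is zero-sum, so player 2's payoff is the negation of player 1's.


def p1_payoff(p_in, q_right):
    """Expected payoff to player 1 when "In" is played with probability p_in
    and player 2 plays "Right" with probability q_right."""
    payoff_after_in = (1.0 - q_right) * (-1.0) + q_right * (-2.0)
    return p_in * payoff_after_in  # the "Out" branch contributes 0


def p2_best_response(p_in, grid=11):
    """Return every q_right on a uniform grid that maximizes player 2's payoff."""
    qs = [i / (grid - 1) for i in range(grid)]
    values = [-p1_payoff(p_in, q) for q in qs]
    best = max(values)
    return [q for q, v in zip(qs, values) if abs(v - best) < 1e-12]


# Unperturbed: player 1 never plays "In", so player 2 is indifferent.
unperturbed = p2_best_response(p_in=0.0)

# Perturbed: "In" is played with probability at least epsilon = 0.01,
# so player 2's information set is reached and "Right" is strictly best.
perturbed = p2_best_response(p_in=0.01)

print(len(unperturbed), perturbed)
```

In this toy game the unperturbed best-response set contains every grid point, while the perturbed one contains only `q_right = 1.0`. This mirrors, in miniature, the reason a perturbed strategy space can yield stronger play at information sets that equilibrium play reaches with low or zero probability.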
