Gradient-Based Algorithms for Finding Nash Equilibria in Extensive Form Games

We present a computational approach to the saddle-point formulation for the Nash equilibria of two-person, zero-sum sequential games of imperfect information. The algorithm is a first-order gradient method based on modern smoothing techniques for non-smooth convex optimization. The algorithm requires O(1/ε) iterations to compute an ε-equilibrium, and the work per iteration is extremely low. These features enable us to find approximate Nash equilibria for sequential games with a tree representation of about 10^10 nodes. This is three orders of magnitude larger than what previous algorithms can handle. We present two heuristic improvements to the basic algorithm and demonstrate their efficacy on a range of real-world games. Furthermore, we demonstrate how the algorithm can be customized to a specific class of problems with enormous memory savings.
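
As a rough illustration of the smoothing idea mentioned above, the sketch below smooths the non-smooth inner maximum of a small matrix game. It is not the paper's algorithm: it assumes a plain matrix game over probability simplices rather than a sequential game, and an entropy prox function with a hypothetical smoothing parameter `mu`. The helper names `column_payoffs` and `smoothed_max` are made up for this example.

```python
import math

def column_payoffs(A, x):
    """Return the vector A^T x: the payoff of each pure column strategy against x."""
    rows, cols = len(A), len(A[0])
    return [sum(A[i][j] * x[i] for i in range(rows)) for j in range(cols)]

def smoothed_max(payoffs, mu):
    """Entropy-smoothed version of max(payoffs).

    Computes max over the simplex of <payoffs, y> - mu * d(y), where
    d(y) = log(n) + sum_j y_j log(y_j) is the (assumed) entropy prox function.
    Returns the smoothed value and the maximizer y, which is a softmax.
    """
    n = len(payoffs)
    top = max(payoffs)
    # Subtract the largest payoff before exponentiating for numerical stability.
    weights = [math.exp((p - top) / mu) for p in payoffs]
    total = sum(weights)
    value = top + mu * math.log(total) - mu * math.log(n)
    y = [w / total for w in weights]
    return value, y

# Matching pennies; the row player (x) minimizes x^T A y, the column player (y) maximizes.
A = [[1.0, -1.0], [-1.0, 1.0]]
x = [0.75, 0.25]
mu = 0.1

payoffs = column_payoffs(A, x)
exact = max(payoffs)                       # non-smooth objective f(x)
smooth, y_mu = smoothed_max(payoffs, mu)   # smooth approximation f_mu(x)

# The gradient of f_mu with respect to x is A y_mu, which is what a
# first-order method would use in place of a subgradient of f.
gradient = [sum(A[i][j] * y_mu[j] for j in range(len(y_mu))) for i in range(len(A))]
```

With this prox function the smoothed value stays within `mu * log(n)` below the exact maximum, so a smaller `mu` gives a tighter approximation at the cost of a less smooth function. Trading these off is the standard route to O(1/ε) iteration bounds in smoothing methods of this kind.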
