Sparsified Linear Programming for Zero-Sum Equilibrium Finding

Computational equilibrium finding in large zero-sum extensive-form imperfect-information games has led to significant recent AI breakthroughs. The fastest algorithms for the problem are new forms of counterfactual regret minimization [Brown and Sandholm, 2019]. In this paper we present a totally different approach to the problem, which is competitive and often orders of magnitude better than the prior state of the art. The equilibrium-finding problem can be formulated as a linear program (LP) [Koller et al., 1994], but solving it as an LP has not been scalable due to the memory requirements of LP solvers, which can often be quadratically worse than CFR-based algorithms. We give an efficient practical algorithm that factors a large payoff matrix into a product of two matrices that are typically dramatically sparser. This allows us to express the equilibrium-finding problem as a linear program with size only a logarithmic factor worse than CFR, and thus allows linear program solvers to run on such games. With experiments on poker endgames, we demonstrate in practice, for the first time, that modern linear program solvers are competitive against even game-specific modern variants of CFR in solving large extensive-form games, and can be used to compute exact solutions unlike iterative algorithms like CFR.

[1]  Nicolas Gillis,et al.  On the Complexity of Robust PCA and ℓ1-norm Low-Rank Matrix Approximation , 2015, Math. Oper. Res..

[2]  B. Stengel,et al.  Efficient Computation of Behavior Strategies , 1996 .

[3]  Javier Peña,et al.  First-Order Algorithm with O(ln(1/e)) Convergence for e-Equilibrium in Two-Person Zero-Sum Games , 2008, AAAI.

[4]  Gene H. Golub,et al.  Matrix computations , 1983 .

[5]  Michael H. Bowling,et al.  Regret Minimization in Games with Incomplete Information , 2007, NIPS.

[6]  Deyu Meng,et al.  Divide-and-Conquer Method for L1 Norm Matrix Factorization in the Presence of Outliers and Missing Data , 2012, ArXiv.

[7]  Tuomas Sandholm,et al.  Regret-Based Pruning in Extensive-Form Games , 2015, NIPS.

[8]  Javier Peña,et al.  Smoothing Techniques for Computing Nash Equilibria of Sequential Games , 2010, Math. Oper. Res..

[9]  Lingzhou Xue,et al.  A Selective Overview of Sparse Principal Component Analysis , 2018, Proceedings of the IEEE.

[10]  Tuomas Sandholm,et al.  Solving Large Sequential Games with the Excessive Gap Technique , 2018, NeurIPS.

[11]  Kevin Waugh,et al.  Faster First-Order Methods for Extensive-Form Game Solving , 2015, EC.

[12]  Noam Brown,et al.  Superhuman AI for heads-up no-limit poker: Libratus beats top professionals , 2018, Science.

[13]  Tuomas Sandholm,et al.  Solving Imperfect-Information Games via Discounted Regret Minimization , 2018, AAAI.

[14]  Tuomas Sandholm,et al.  Correlation in Extensive-Form Games: Saddle-Point Formulation and Benchmarks , 2019, NeurIPS.

[15]  Jean-Philippe Vert,et al.  Tight convex relaxations for sparse matrix factorization , 2014, NIPS.

[16]  Oskari Tammelin,et al.  Solving Large Imperfect Information Games Using CFR+ , 2014, ArXiv.

[17]  Kevin Waugh,et al.  Accelerating Best Response Calculation in Large Extensive Games , 2011, IJCAI.

[18]  Bernhard von Stengel,et al.  Fast algorithms for finding randomized strategies in game trees , 1994, STOC '94.

[19]  Pradeep Ravikumar,et al.  Sparse Linear Programming via Primal and Dual Augmented Coordinate Descent , 2015, NIPS.

[20]  Neil Burch,et al.  Heads-up limit hold’em poker is solved , 2015, Science.

[21]  Michael H. Bowling,et al.  Bayes' Bluff: Opponent Modelling in Poker , 2005, UAI 2005.