Algorithms for abstracting and solving imperfect information games

Game theory is the mathematical study of rational behavior in strategic environments. In many settings, most notably two-person zero-sum games, game theory provides particularly strong and appealing solution concepts. Furthermore, these solutions are efficiently computable in the complexity-theoretic sense. However, in most interesting potential applications in artificial intelligence, the solutions are difficult to compute using current techniques, primarily because of the extremely large state spaces of the environments. In this thesis, we propose new algorithms for tackling these computational difficulties. In one stream of research, we introduce automated abstraction algorithms for sequential games of imperfect information. These algorithms take as input a description of a game and produce as output a description of a strategically similar, but smaller, game. We present algorithms that are lossless (i.e., equilibrium-preserving), as well as algorithms that are lossy but can yield much smaller games while still retaining the most important features of the original game. In a second stream of research, we develop specialized optimization algorithms for finding ε-equilibria in sequential games of imperfect information. The algorithms are based on recent advances in nonsmooth convex optimization (namely the excessive gap technique) and provide significant improvements over previous algorithms for finding ε-equilibria. Combining these two streams, we enable the application of game theory to games far larger than was previously possible. As an illustrative example, we find near-optimal solutions for a four-round model of Texas Hold’em poker, and demonstrate that the resulting player is significantly better than previous computer poker players. In addition to the above (already completed) work, we discuss how the same techniques can be used to construct an agent for no-limit Texas Hold’em poker (a game with an infinite number of pure strategies).
We propose developing worst-case guarantees (both ex ante and ex post) for automated abstraction algorithms. We also propose a regret-minimizing pure-strategy solution concept appropriate for sequential games with many players, together with an algorithm for computing it. Finally, we propose specialized interior-point algorithms for equilibrium computation in extensive-form games (possibly for computing equilibrium refinements such as sequential equilibrium), as well as a prioritized updating scheme for speeding up the excessive gap technique family of algorithms.
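
For readers unfamiliar with the term, the standard definition of an ε-equilibrium can be written as follows. The notation is assumed here for illustration and is not taken from the thesis: σ denotes a strategy profile, u_i is player i's expected utility, and Σ_i is player i's strategy set.

```latex
% Assumed notation (not from the thesis):
%   \sigma = (\sigma_1, \dots, \sigma_n)  strategy profile
%   u_i                                   expected utility of player i
%   \Sigma_i                              strategy set of player i
\[
  u_i(\sigma_i', \sigma_{-i}) \;\le\; u_i(\sigma_i, \sigma_{-i}) + \epsilon
  \qquad \text{for every player } i \text{ and every } \sigma_i' \in \Sigma_i .
\]
```

In words, no player can gain more than ε by deviating unilaterally. Setting ε = 0 recovers the usual Nash equilibrium condition.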
