Paper information - Algorithms for abstracting and solving imperfect information games

Algorithms for abstracting and solving imperfect information games

Game theory is the mathematical study of rational behavior in strategic environments. In many settings, most notably two-person zero-sum games, game theory provides particularly strong and appealing solution concepts. Furthermore, these solutions are efficiently computable in the complexity-theory sense. However, in most interesting potential applications in artificial intelligence, the solutions are difficult to compute using current techniques, primarily because of the extremely large state spaces of the environments. In this thesis, we propose new algorithms for tackling these computational difficulties. In one stream of research, we introduce automated abstraction algorithms for sequential games of imperfect information. These algorithms take as input a description of a game and produce as output a description of a strategically similar, but smaller, game. We present algorithms that are lossless (i.e., equilibrium-preserving), as well as algorithms that are lossy but can yield much smaller games while still retaining the most important features of the original game. In a second stream of research, we develop specialized optimization algorithms for finding ε-equilibria in sequential games of imperfect information. The algorithms are based on recent advances in nonsmooth convex optimization (namely the excessive gap technique) and provide significant improvements over previous algorithms for finding ε-equilibria. Combining these two streams, we enable the application of game theory to games far larger than was previously possible. As an illustrative example, we find near-optimal solutions for a four-round model of Texas Hold’em poker, and demonstrate that the resulting player is significantly better than previous computer poker players. In addition to the above (already completed) work, we discuss how the same techniques can be used to construct an agent for no-limit Texas Hold’em poker (a game with an infinite number of pure strategies).
We propose developing worst-case guarantees (both ex ante and ex post) for automated abstraction algorithms. We also propose a regret-minimizing pure-strategy solution concept appropriate for sequential games with many players, together with an algorithm for computing it. Finally, we propose specialized interior-point algorithms for equilibrium computation in extensive form games (possibly for computing equilibrium refinements such as sequential equilibrium), as well as a prioritized updating scheme for speeding up the excessive gap technique family of algorithms.
