Paper information - Better automated abstraction techniques for imperfect information games, with application to Texas Hold'em poker

We present new approximation methods for computing game-theoretic strategies for sequential games of imperfect information. At a high level, we contribute two new ideas. First, we introduce a new state-space abstraction algorithm. In each round of the game, there is a limit to the number of strategically different situations that an equilibrium-finding algorithm can handle. Given this constraint, we use clustering to discover similar positions, and we compute the abstraction via an integer program that minimizes the expected error at each stage of the game. Second, we present a method for computing the leaf payoffs for a truncated version of the game by simulating the actions in the remaining portion of the game. This allows the equilibrium-finding algorithm to take into account the entire game tree while having to explicitly solve only a truncated version. Experiments show that each of our two new techniques improves performance dramatically in Texas Hold'em poker. The techniques lead to a drastic improvement over prior approaches for automatically generating agents, and our agent plays competitively even against the best agents overall.
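The clustering step mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each hand is summarized by a single made-up strength value in [0, 1] and that a round may use at most k buckets; the function name `kmeans_1d` and the sample values are hypothetical. It shows only plain one-dimensional k-means (Lloyd's algorithm) and does not reproduce the integer program that the abstract describes for minimizing expected error.

```python
from typing import List


def kmeans_1d(values: List[float], k: int, iters: int = 100) -> List[int]:
    """Group 1-D values into k buckets with Lloyd's algorithm (k-means)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic start: pick k centers spread evenly over the sorted data.
    centers = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels: List[int] = []
    for _ in range(iters):
        # Assignment step: each value goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: abs(v - centers[c])) for v in values
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, new_labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
        # Stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Hypothetical hand-strength values (illustrative only, not from the paper).
strengths = [0.10, 0.12, 0.15, 0.48, 0.50, 0.55, 0.90, 0.92]
buckets = kmeans_1d(strengths, k=3)
print(buckets)
```

Hands that share a bucket index would be treated as strategically identical in the abstracted game, which is how a cap of k distinct situations per round could be respected in this simplified setting.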

Tuomas Sandholm | Andrew Gilpin
