Paper Information - Potential-Aware Automated Abstraction of Sequential Games, and Holistic Equilibrium Analysis of Texas Hold'em Poker

We present a new abstraction algorithm for sequential imperfect information games. While most prior abstraction algorithms employ a myopic expected-value computation as a similarity metric, our algorithm considers a higher-dimensional space consisting of histograms over abstracted classes of states from later stages of the game. This enables our bottom-up abstraction algorithm to automatically take into account potential: a hand can become relatively better (or worse) over time and the strength of different hands can get resolved earlier or later in the game. We further improve the abstraction quality by making multiple passes over the abstraction, enabling the algorithm to narrow the scope of analysis to information that is relevant given abstraction decisions made for earlier parts of the game. We also present a custom indexing scheme based on suit isomorphisms that enables one to work on significantly larger models than before. We apply the techniques to heads-up limit Texas Hold'em poker. Whereas all prior game theory-based work for Texas Hold'em poker used generic off-the-shelf linear program solvers for the equilibrium analysis of the abstracted game, we make use of a recently developed algorithm based on the excessive gap technique from convex optimization. This paper is, to our knowledge, the first to abstract and game-theoretically analyze all four betting rounds in one run (rather than splitting the game into phases). The resulting player, GS3, beats BluffBot, GS2, Hyperborean, Monash-BPP, Sparbot, Teddy, and Vexbot, each with statistical significance. To our knowledge, those competitors are the best prior programs for the game.
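The contrast the abstract draws between a myopic expected-value metric and histograms over later-stage abstract classes can be illustrated with a small sketch. Everything below is a toy assumption rather than the paper's implementation: the hand names, the successor states, the three later-round buckets and their values are invented, and plain k-means with squared Euclidean distance stands in for whatever clustering routine the authors actually use. The point is only that hands with identical expected value can have very different histograms, so a histogram-based metric can separate them.

```python
from collections import Counter

def transition_histogram(successors, bucket_of, num_buckets):
    """Fraction of (equally likely) successor states in each later-round bucket."""
    counts = Counter(bucket_of[s] for s in successors)
    return [counts[b] / len(successors) for b in range(num_buckets)]

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=20):
    """Lloyd's k-means with deterministic farthest-point seeding (illustrative only)."""
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        centers.append(list(farthest))
    labels = []
    for _ in range(iterations):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda j: squared_distance(p, centers[j]))
                  for p in points]
        # Move every non-empty center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical later-round abstraction: bucket 0 = weak, 1 = medium, 2 = strong.
bucket_of = {"w1": 0, "w2": 0, "w3": 0,
             "m1": 1, "m2": 1, "m3": 1, "m4": 1, "m5": 1,
             "s1": 2, "s2": 2, "s3": 2}
bucket_value = [0.0, 0.5, 1.0]  # assumed value of each later-round bucket

# Hypothetical earlier-round hands and their equally likely successor states.
successors = {
    "made_pair":     ["m1", "m2", "m3", "m4"],  # stays medium
    "flush_draw":    ["w1", "w2", "s1", "s2"],  # becomes weak or strong
    "another_pair":  ["m1", "m2", "m3", "m5"],
    "straight_draw": ["w1", "w3", "s1", "s3"],
}

hands = list(successors)
histograms = [transition_histogram(successors[h], bucket_of, 3) for h in hands]

# Myopic metric: one expected value per hand.
expected_values = [sum(p * v for p, v in zip(hist, bucket_value)) for hist in histograms]

# Potential-aware metric: cluster the full histograms.
labels = kmeans(histograms, k=2)

for hand, ev, label in zip(hands, expected_values, labels):
    print(f"{hand:14s} EV={ev:.2f} bucket={label}")
```

In this toy setup all four hands have the same expected value, so an expected-value metric alone cannot tell them apart, while the histogram clustering groups the two made pairs together and the two draws together. Applying the same step round by round, starting from the last round and working toward the first, gives the bottom-up flavor described in the abstract.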
