Automated Action Abstraction of Imperfect Information Extensive-Form Games

Multi-agent decision problems can often be formulated as extensive-form games. We focus on imperfect information extensive-form games in which one or more actions at many decision points have an associated continuous or many-valued parameter. A stock trading agent, in addition to deciding whether or not to buy, must decide how much to buy. In no-limit poker, in addition to selecting a probability for each action, the agent must decide how much to bet for each betting action. Selecting values for these parameters makes such games extremely large: two-player no-limit Texas Hold'em poker with stacks of 500 big blinds has approximately 10^71 states, more than 10^50 times the number of states in two-player limit Texas Hold'em. The main contribution of this paper is a technique that abstracts a game's action space by selecting one, or a small number, of the many values for each parameter. We show that strategies computed using this new algorithm for no-limit Leduc poker exhibit significant utility gains over ε-Nash equilibrium strategies computed with standard, hand-crafted parameter value abstractions.
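To make the parameter-selection problem concrete, the sketch below illustrates the kind of hand-crafted action abstraction such a technique is measured against: a small fixed set of pot-fraction bet sizes, a translation rule that maps an arbitrary real bet onto that set, and a crude grid-search stand-in for automated selection of a single bet-size parameter. This is a minimal sketch, not the paper's algorithm; all names (POT_FRACTIONS, abstract_actions, translate, select_bet_fraction, solve_and_evaluate) are illustrative assumptions rather than identifiers from the paper.

```python
import math

# Illustrative only: a hand-crafted bet-size abstraction of the kind used
# as a baseline in no-limit poker. All names here are hypothetical.

POT_FRACTIONS = [0.5, 1.0, 2.0]  # half pot, pot, double pot


def abstract_actions(pot: float, stack: float) -> list[float]:
    """Enumerate the abstract bet sizes available at a decision point."""
    bets = [f * pot for f in POT_FRACTIONS if f * pot <= stack]
    bets.append(stack)  # the all-in bet is always kept as an abstract action
    return sorted(set(bets))


def translate(real_bet: float, pot: float, stack: float) -> float:
    """Map a real bet (> 0) onto the nearest abstract bet size, using the
    geometric (log-scale) distance common in poker state translation."""
    return min(abstract_actions(pot, stack),
               key=lambda b: abs(math.log(b / real_bet)))


def select_bet_fraction(candidates, solve_and_evaluate):
    """Crude stand-in for automated selection of a single bet-size
    parameter: solve the abstract game induced by each candidate and keep
    the most profitable one.  solve_and_evaluate is a hypothetical oracle,
    e.g. an equilibrium solver plus a best-response evaluation."""
    return max(candidates, key=solve_and_evaluate)
```

For example, with a pot of 100 and a stack of 1000, translate(130.0, 100.0, 1000.0) returns 100.0: an opponent's unusual bet of 130 is answered as if it were a pot-sized bet, which is exactly the kind of coarse mapping that automated parameter selection aims to improve on.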
