Paper information - Refining Subgames in Large Imperfect Information Games

The leading approach to solving large imperfect information games is to pre-calculate an approximate solution using a simplified abstraction of the full game; that solution is then used to play the original, full-scale game. The abstraction step is necessitated by the size of the game tree. However, as the original game progresses, the remaining portion of the tree (the subgame) becomes smaller. An appealing idea is to use the simplified abstraction to play the early parts of the game and then, once the subgame becomes tractable, to calculate a solution using a finer-grained abstraction in real time, creating a combined final strategy. While this approach is straightforward for perfect information games, it is a much more complex problem for imperfect information games. If the subgame is solved locally, the opponent can alter his play prior to this subgame to exploit our combined strategy. To prevent this, we introduce the notion of subgame margin, a simple value with appealing properties. If any best response reaches the subgame, the improvement in exploitability of the combined strategy is (at least) proportional to the subgame margin. This motivates subgame refinements resulting in large positive margins. Unfortunately, current techniques either neglect subgame margin (potentially leading to a large negative subgame margin and drastically more exploitable strategies), or guarantee only non-negative subgame margin (possibly producing the original, unrefined strategy, even if much stronger strategies are possible). Our technique remedies this problem by maximizing the subgame margin and is guaranteed to find the optimal solution. We evaluate our technique using one of the top participants of the AAAI-14 Computer Poker Competition, the leading playground for agents in imperfect information settings.
