Iterative Algorithm for Solving Two-player Zero-sum Extensive-form Games with Imperfect Information

We develop and evaluate a new exact algorithm for finding Nash equilibria of two-player zero-sum extensive-form games with imperfect information. Our approach is based on the sequence-form representation of the game and builds on the double-oracle framework, which has been applied successfully to other classes of games. The algorithm uses an iterative decomposition: it solves a restricted game, then uses fast best-response algorithms to add sequences to that restricted game over time. We demonstrate our algorithm on a class of adversarial graph search games motivated by real-world border patrolling scenarios. The results indicate that our framework is a promising way to scale up solutions for extensive-form games, reducing both memory and computation time requirements.
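
The restricted-game loop described above can be illustrated with a minimal sketch. It is not the sequence-form algorithm of the paper: as a simplifying assumption it works on a small zero-sum matrix game (row player maximizes), and it swaps the exact linear program for fictitious play as an approximate solver of each restricted game, so the returned bounds are approximate. The names `solve_restricted` and `double_oracle` are hypothetical and chosen only for illustration.

```python
def solve_restricted(payoff, rows, cols, iterations=20000):
    """Approximately solve the restricted game by fictitious play.

    Assumption: this stands in for the exact LP solve; it only
    approximates the equilibrium of the restricted game.
    """
    row_counts = {r: 0 for r in rows}
    col_counts = {c: 0 for c in cols}
    row_counts[rows[0]] = 1
    col_counts[cols[0]] = 1
    for _ in range(iterations):
        # Each player best-responds to the opponent's empirical play so far.
        best_r = max(rows, key=lambda r: sum(payoff[r][c] * n
                                             for c, n in col_counts.items()))
        best_c = min(cols, key=lambda c: sum(payoff[r][c] * n
                                             for r, n in row_counts.items()))
        row_counts[best_r] += 1
        col_counts[best_c] += 1
    total_r = sum(row_counts.values())
    total_c = sum(col_counts.values())
    row_mix = {r: n / total_r for r, n in row_counts.items()}
    col_mix = {c: n / total_c for c, n in col_counts.items()}
    return row_mix, col_mix


def double_oracle(payoff, tol=1e-6):
    """Double-oracle loop on a matrix game (illustrative sketch only)."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    rows, cols = [0], [0]  # arbitrary initial restricted game
    while True:
        row_mix, col_mix = solve_restricted(payoff, rows, cols)

        def row_value(r):
            return sum(payoff[r][c] * p for c, p in col_mix.items())

        def col_value(c):
            return sum(payoff[r][c] * p for r, p in row_mix.items())

        # Best-response oracles search the full game, not the restricted one.
        br_row = max(range(n_rows), key=row_value)
        br_col = min(range(n_cols), key=col_value)
        upper = row_value(br_row)  # upper bound on the game value
        lower = col_value(br_col)  # lower bound on the game value

        # Stop when the bounds meet or neither oracle finds a new strategy.
        if upper - lower <= tol or (br_row in rows and br_col in cols):
            return row_mix, col_mix, lower, upper
        if br_row not in rows:
            rows.append(br_row)
        if br_col not in cols:
            cols.append(br_col)


if __name__ == "__main__":
    game = [[3, 1, 4], [2, 0, 1], [5, 2, 6]]  # saddle point at row 2, column 1
    row_mix, col_mix, lower, upper = double_oracle(game)
    print(sorted(row_mix), sorted(col_mix), lower, upper)
```

In the example game the loop stops with a restricted game of two rows and two columns, without ever adding the remaining row or column. This is the source of the memory savings the abstract refers to, here in a much simpler setting.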
