Combining Compact Representation and Incremental Generation in Large Games with Sequential Strategies

Many search and security games played on a graph can be modeled as normal-form zero-sum games in which each strategy is a sequence of actions. The size of the strategy space poses a computational challenge when solving these games. This complexity is tackled either by using a compact representation of sequential strategies together with linear programming, or by the incremental strategy generation of iterative double-oracle methods. In this paper, we present a novel hybrid of these two approaches: the compact-strategy double-oracle (CS-DO) algorithm, which combines the advantages of the compact representation with incremental strategy generation. We experimentally compare CS-DO with the standard approaches and analyze the impact of the size of the support on the performance of the algorithms. The results show that CS-DO dramatically improves the convergence rate in games with non-trivial support.
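To illustrate the incremental strategy generation mentioned above, the following is a minimal sketch of a generic double-oracle loop on a small zero-sum matrix game. It is not the CS-DO algorithm and does not use the compact representation. The names `fictitious_play` and `double_oracle` are hypothetical, and the restricted game is approximated by fictitious play rather than solved exactly by linear programming, purely to keep the sketch free of external dependencies.

```python
def fictitious_play(A, rows, cols, iters=5000):
    """Approximate an equilibrium of the game restricted to `rows` x `cols`.

    Assumption: fictitious play stands in for the exact LP solver that a
    double-oracle method would normally use on the restricted game.
    """
    row_counts = {i: 0 for i in rows}
    col_counts = {j: 0 for j in cols}
    i, j = rows[0], cols[0]
    for _ in range(iters):
        row_counts[i] += 1
        col_counts[j] += 1
        # Each player best-responds to the opponent's empirical play so far.
        i = max(rows, key=lambda r: sum(A[r][c] * n for c, n in col_counts.items()))
        j = min(cols, key=lambda c: sum(A[r][c] * n for r, n in row_counts.items()))
    x = {r: n / iters for r, n in row_counts.items()}
    y = {c: n / iters for c, n in col_counts.items()}
    return x, y


def double_oracle(A, iters=5000):
    """Grow restricted strategy sets until neither oracle finds a new strategy.

    A[i][j] is the payoff to the row player (maximizer); the column player
    minimizes. Returns mixed strategies and lower/upper bounds on the value.
    """
    n_rows, n_cols = len(A), len(A[0])
    rows, cols = [0], [0]  # start from one arbitrary strategy per player
    while True:
        x, y = fictitious_play(A, rows, cols, iters)
        # Oracles: best responses searched over the FULL strategy space.
        row_payoff = lambda r: sum(A[r][c] * p for c, p in y.items())
        col_payoff = lambda c: sum(A[r][c] * p for r, p in x.items())
        br_row = max(range(n_rows), key=row_payoff)
        br_col = min(range(n_cols), key=col_payoff)
        upper = row_payoff(br_row)  # no row strategy can earn more against y
        lower = col_payoff(br_col)  # x guarantees at least this much
        grew = False
        if br_row not in rows:
            rows.append(br_row)
            grew = True
        if br_col not in cols:
            cols.append(br_col)
            grew = True
        if not grew:
            return x, y, lower, upper


# Example: rock-paper-scissors, whose game value is 0.
rps = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
x, y, lower, upper = double_oracle(rps)
```

The returned `lower` and `upper` always bracket the true game value, since they are the payoffs of best responses against the current mixed strategies. In a game with a small support, such as `[[4, 2, 5], [3, 1, 0]]`, the loop stops without ever adding the third column to the restricted game, which is the effect that incremental generation relies on.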
