Paper Information - Gradient Methods for Stackelberg Games

Gradient Methods for Stackelberg Games

Stackelberg games are two-stage games in which the first player (called the leader) commits to a strategy, after which the other player (the follower) selects a best response. These games have seen numerous practical applications in security settings, where the leader (in this case, a defender) must allocate resources to protect various targets. Real-world applications include the scheduling of US federal air marshals on international flights and resource allocation at LAX airport. However, the best-known algorithm for solving general Stackelberg games requires solving integer programs, and fails to scale beyond a small number (significantly fewer than 100) of leader actions or follower types. In this paper, we present a new gradient-based approach for solving large Stackelberg games in security settings. Large-scale control problems are often solved by restricting the controller to a rich parameterized class of policies; the optimal control can then be computed using Monte Carlo gradient methods. We demonstrate that the same approach can be taken in a strategic setting. We evaluate our approach empirically, demonstrating that it can achieve negligible regret against the leader's true equilibrium strategy while scaling to large games.
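To make the idea of a parameterized leader policy trained with Monte Carlo gradients concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a hypothetical 3-target security game with illustrative payoffs, a leader whose coverage distribution is a softmax over parameters theta, and a follower who observes that distribution and attacks according to a quantal (softmax) response with rationality LAM. The quantal response is an assumption made here so that the follower's behavior is differentiable; the abstract itself describes a best-responding follower. All names, payoffs, and hyperparameters below are hypothetical.

```python
import math
import random

# Hypothetical 3-target security game (illustrative payoffs, not from the paper).
ATT_REWARD = [10.0, 5.0, 2.0]      # follower payoff for attacking an uncovered target
ATT_PENALTY = [0.0, 0.0, 0.0]      # follower payoff for attacking a covered target
DEF_REWARD = [0.0, 0.0, 0.0]       # leader payoff when the attacked target is covered
DEF_PENALTY = [-10.0, -5.0, -2.0]  # leader payoff when the attacked target is uncovered
LAM = 1.0                          # assumed follower rationality (quantal response)


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def follower_response(x):
    """Attack distribution of a quantal-response follower who observes coverage x."""
    ua = [x[t] * ATT_PENALTY[t] + (1 - x[t]) * ATT_REWARD[t] for t in range(len(x))]
    return softmax([LAM * u for u in ua])


def leader_value(theta):
    """Exact expected leader utility; used only to monitor progress."""
    x = softmax(theta)
    q = follower_response(x)
    return sum(q[t] * (x[t] * DEF_REWARD[t] + (1 - x[t]) * DEF_PENALTY[t])
               for t in range(len(x)))


def score(theta, a, t):
    """Gradient of log P(a, t | theta): leader term plus follower term.

    The follower term is needed because the attack distribution depends on
    theta through the committed coverage x.
    """
    x = softmax(theta)
    q = follower_response(x)
    n = len(theta)
    g = [LAM * ((1.0 if s == t else 0.0) - q[s]) * (ATT_PENALTY[s] - ATT_REWARD[s]) * x[s]
         for s in range(n)]
    g_sum = sum(g)
    return [((1.0 if k == a else 0.0) - x[k]) + (g[k] - x[k] * g_sum) for k in range(n)]


def train(steps=400, batch=100, lr=0.02, seed=0):
    """Score-function (REINFORCE-style) gradient ascent on the leader's utility."""
    rng = random.Random(seed)
    n = len(ATT_REWARD)
    theta = [0.0] * n
    for _ in range(steps):
        x = softmax(theta)
        q = follower_response(x)
        samples = []
        for _ in range(batch):
            a = rng.choices(range(n), weights=x)[0]  # leader's realized coverage
            t = rng.choices(range(n), weights=q)[0]  # follower's attacked target
            r = DEF_REWARD[t] if a == t else DEF_PENALTY[t]
            samples.append((a, t, r))
        # Batch-mean baseline: a common variance-reduction heuristic.
        baseline = sum(r for _, _, r in samples) / batch
        grad = [0.0] * n
        for a, t, r in samples:
            sc = score(theta, a, t)
            for k in range(n):
                grad[k] += (r - baseline) * sc[k] / batch
        theta = [theta[k] + lr * grad[k] for k in range(n)]
    return theta
```

Calling train() and comparing leader_value(theta) with leader_value([0.0, 0.0, 0.0]) gives a quick check that the sampled gradients improve the leader's expected utility over uniform coverage. Because the follower reacts to the committed distribution rather than the realized action, the score includes the derivative of the follower's log-probability as well as the leader's.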

Michael P. Wellman | Kareem Amin | Satinder P. Singh
