Computing optimal strategy against quantal response in security games

To step beyond the first-generation deployments of attacker-defender security games (for LAX Police, US FAMS and others), we must relax the assumption that the human adversary is perfectly rational. This assumption is a well-accepted limitation of classical game theory, and quantal response (QR) has shown very promising results in modeling human bounded rationality. However, computing optimal defender strategies in real-world security games against a QR model of attackers raises two difficulties: (1) solving a nonlinear, non-convex optimization problem efficiently for massive real-world security games; and (2) handling constraints on the assignment of security resources, which add to the complexity of computing the optimal defender strategy. This paper presents two new algorithms to address these difficulties: GOSAQ computes the globally optimal defender strategy against a QR model of attackers when there are no resource constraints and gives an efficient heuristic otherwise; PASAQ in turn provides an efficient approximation of the optimal defender strategy with or without resource constraints. The two algorithms are based on three key ideas: (i) a binary search method to solve the fractional optimization problem efficiently, (ii) construction of a convex optimization problem through a nonlinear transformation, and (iii) a piecewise linear approximation of the nonlinear terms in the problem. Additional contributions include proofs of approximation bounds and detailed experimental results showing the advantages of GOSAQ and PASAQ in solution quality over the benchmark algorithm (BRQR), as well as the efficiency of PASAQ. Given these results, PASAQ is at the heart of the PROTECT system, which is deployed for the US Coast Guard in the port of Boston and is now headed to other ports.
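The binary search idea in (i) can be illustrated with a small sketch. This is not the paper's implementation. It assumes a standard logit QR attacker, in which target i is attacked with probability proportional to exp(lam * attacker utility), so the defender's expected utility is a ratio N(x)/D(x) with D(x) > 0. A value r is achievable exactly when some feasible x satisfies N(x) - r*D(x) >= 0, which allows a binary search on r. The payoff field names, the value of `lam`, the two-target game, and the coverage grid used for the inner maximization are all hypothetical; the grid stands in for the convex subproblem that the abstract says is obtained through a nonlinear transformation.

```python
import math

def qr_weights_and_payoffs(x, targets, lam):
    """Unnormalized logit QR attack weights and defender payoffs per target."""
    weights = [math.exp(lam * (xi * t["att_pen"] + (1 - xi) * t["att_rew"]))
               for xi, t in zip(x, targets)]
    payoffs = [xi * t["def_rew"] + (1 - xi) * t["def_pen"]
               for xi, t in zip(x, targets)]
    return weights, payoffs

def defender_utility(x, targets, lam):
    """Defender's expected utility N(x) / D(x) against the QR attacker."""
    weights, payoffs = qr_weights_and_payoffs(x, targets, lam)
    return sum(w * u for w, u in zip(weights, payoffs)) / sum(weights)

def parametric_value(x, targets, lam, r):
    """N(x) - r * D(x); non-negative exactly when x achieves utility >= r."""
    weights, payoffs = qr_weights_and_payoffs(x, targets, lam)
    return sum(w * (u - r) for w, u in zip(weights, payoffs))

def binary_search_strategy(targets, lam, candidates, tol=1e-6):
    """Binary search on the utility level r over a finite candidate set."""
    # The utility is a weighted average of per-target defender payoffs,
    # so it always lies between the worst penalty and the best reward.
    lo = min(t["def_pen"] for t in targets)
    hi = max(t["def_rew"] for t in targets)
    best = candidates[0]
    while hi - lo > tol:
        r = (lo + hi) / 2
        # Inner problem: maximize N(x) - r * D(x). Here by enumeration
        # (an assumption of this sketch, not the paper's method).
        x = max(candidates, key=lambda c: parametric_value(c, targets, lam, r))
        if parametric_value(x, targets, lam, r) >= 0:
            lo, best = r, x   # r is achievable, search higher
        else:
            hi = r            # r is not achievable, search lower
    return best, lo

# Hypothetical two-target game with one resource split as (x1, 1 - x1).
targets = [
    {"def_rew": 5, "def_pen": -8, "att_rew": 8, "att_pen": -5},
    {"def_rew": 3, "def_pen": -2, "att_rew": 2, "att_pen": -3},
]
lam = 0.8
candidates = [(k / 100, 1 - k / 100) for k in range(101)]
best, value = binary_search_strategy(targets, lam, candidates)
print(best, value)
```

Each iteration halves the interval [lo, hi], so the number of inner problems solved grows only logarithmically in 1/tol. In this sketch the returned strategy is within tol of the best utility over the candidate set.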
