Effective solutions for real-world Stackelberg games: when agents must deal with human uncertainties

How do we build multiagent algorithms for agents that interact with human adversaries? Stackelberg games are natural models for many important applications that involve human interaction, such as oligopolistic markets and security domains. In a Stackelberg game, one player, the leader, commits to a strategy, and the follower then makes a decision with knowledge of the leader's commitment. Existing algorithms for Stackelberg games efficiently find optimal solutions (leader strategies), but they critically assume that the follower plays optimally. Unfortunately, in real-world applications, agents face human followers (adversaries) who, because of their bounded rationality and limited observation of the leader's strategy, may deviate from the expected optimal response. Failing to account for these likely deviations can cause an unacceptable degradation in the leader's reward, particularly in security applications where these algorithms have seen real-world deployment. To address this crucial problem, this paper introduces three new mixed-integer linear programs (MILPs) for Stackelberg games that account for human adversaries, incorporating (i) novel anchoring theories on human perception of probability distributions and (ii) robustness approaches for MILPs to address human imprecision. Because these new approaches consider human adversaries, traditional proofs of correctness or optimality are insufficient; instead, it is necessary to rely on empirical validation. To that end, this paper considers two settings based on real deployed security systems and compares 6 approaches (three new and three previous) under 4 observability conditions, involving 98 human subjects playing 1360 games in total. The final conclusion is that a model incorporating both robustness and anchoring achieves statistically significantly better rewards while maintaining equivalent or faster solution speeds compared to existing approaches.
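
To make the leader-follower setup concrete, the following is a minimal sketch, not the paper's MILPs. It assumes a hypothetical two-target security game with made-up payoff matrices, a follower who best-responds to whatever leader strategy they perceive, and one simple possible form of anchoring in which the perceived strategy is pulled toward the uniform distribution by a weight `alpha`. The names `LEADER_PAYOFF`, `FOLLOWER_PAYOFF`, `anchored`, `best_response`, and `leader_reward` are all illustrative.

```python
# Hypothetical payoffs: rows = target the leader covers, columns = target attacked.
LEADER_PAYOFF = [[1.0, -2.0],
                 [-1.0, 2.0]]
FOLLOWER_PAYOFF = [[-1.0, 2.0],
                   [1.0, -1.0]]


def anchored(x, alpha):
    """Assumed anchoring model: blend the true mixed strategy x with uniform.

    alpha = 0 means the follower perceives x exactly;
    alpha = 1 means the follower perceives a uniform distribution.
    """
    n = len(x)
    return [(1.0 - alpha) * p + alpha / n for p in x]


def best_response(perceived_x):
    """Follower action maximizing expected follower payoff under perceived_x."""
    n_actions = len(FOLLOWER_PAYOFF[0])
    expected = [
        sum(p * FOLLOWER_PAYOFF[i][j] for i, p in enumerate(perceived_x))
        for j in range(n_actions)
    ]
    return max(range(n_actions), key=lambda j: expected[j])


def leader_reward(x, follower_action):
    """Leader's expected reward under the TRUE strategy x and a follower action."""
    return sum(p * LEADER_PAYOFF[i][follower_action] for i, p in enumerate(x))


x = [0.2, 0.8]  # leader's committed mixed strategy (illustrative)
for alpha in (0.0, 1.0):
    action = best_response(anchored(x, alpha))
    print(alpha, action, leader_reward(x, action))
```

In this toy instance the follower's chosen target changes between `alpha = 0.0` and `alpha = 1.0`, so the leader's actual reward depends on how the follower perceives the commitment. That is the kind of deviation from the perfectly rational response that the abstract argues a leader strategy should account for.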