Three Strategies to Success: Learning Adversary Models in Security Games

State-of-the-art applications of Stackelberg security games -- including wildlife protection -- offer a wealth of data that can be used to learn the behavior of the adversary. Existing approaches, however, either make strong assumptions about the structure of the data or gather new data through online algorithms that are likely to play severely suboptimal strategies. We develop a new approach that learns the parameters of the behavioral model of a boundedly rational attacker (thereby pinpointing a near-optimal defender strategy) by observing how the attacker responds to only three defender strategies. We validate our approach with experiments on real and synthetic data.
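The paper's exact estimator is not reproduced here. As a minimal illustrative sketch of the general idea, the following assumes a single-parameter quantal response attacker (attack probability proportional to the exponential of expected utility, scaled by a rationality parameter λ) and recovers λ by maximum likelihood from attack counts observed under three defender coverage strategies. All payoff values, the coverage vectors, and the grid-search fit are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Per-target attacker payoffs: reward if the target is uncovered,
# penalty if it is covered (hypothetical values).
rewards = np.array([5.0, 3.0, 8.0, 2.0, 6.0])
penalties = np.array([-4.0, -2.0, -6.0, -1.0, -5.0])

def attack_probs(coverage, lam):
    """Quantal-response attack distribution: p_i ∝ exp(lam * EU_i)."""
    eu = (1.0 - coverage) * rewards + coverage * penalties
    z = np.exp(lam * (eu - eu.max()))  # max-shift for numerical stability
    return z / z.sum()

# Three defender strategies, each a vector of per-target coverage probabilities.
strategies = [
    np.array([0.4, 0.1, 0.3, 0.0, 0.2]),
    np.array([0.1, 0.2, 0.4, 0.1, 0.2]),
    np.array([0.3, 0.3, 0.1, 0.2, 0.1]),
]

# Simulate observed attack counts under a hidden "true" rationality parameter.
true_lam = 0.8
counts = [rng.multinomial(500, attack_probs(x, true_lam)) for x in strategies]

def log_likelihood(lam):
    """Multinomial log-likelihood of the observed counts under lambda."""
    return sum(c @ np.log(attack_probs(x, lam))
               for x, c in zip(strategies, counts))

# Fit lambda by a simple grid search over the likelihood.
grid = np.linspace(0.01, 3.0, 300)
lam_hat = grid[np.argmax([log_likelihood(l) for l in grid])]
print(round(float(lam_hat), 2))
```

With three strategies and enough observations per strategy, the recovered λ lands close to the true value; richer behavioral models (e.g., SUQR-style feature weights) would add parameters to the same likelihood.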
