Decision-Focused Learning of Adversary Behavior in Security Games

Stackelberg security games are a critical tool for maximizing the utility of limited defense resources to protect important targets from an intelligent adversary. Motivated by green security, where the defender may only observe an adversary's response to defense on a limited set of targets, we study the problem of defending against the same adversary on a larger set of targets from the same distribution. We give a theoretical justification for why standard two-stage learning approaches, where a model of the adversary is trained for predictive accuracy and then optimized against, may fail to maximize the defender's expected utility in this setting. We develop a decision-focused learning approach, where the adversary behavior model is optimized for decision quality, and show empirically that it achieves higher defender expected utility than the two-stage approach when there is limited training data and a large number of target features.
