Comparing human behavior models in repeated Stackelberg security games: An extended study

Abstract: Several competing human behavior models have been proposed to model boundedly rational adversaries in repeated Stackelberg Security Games (SSGs). However, these existing models fail to address three main issues that are detrimental to defender performance. First, while they attempt to learn adversary behavior models from adversaries' past actions (“attacks on targets”), they fail to take into account adversaries' future adaptation based on the successes or failures of these past actions. Second, existing algorithms fail to learn a reliable model of the adversary unless sufficient data has been collected by exposing enough of the attack surface, and such data is often lacking in the initial rounds of the repeated SSG. Third, current leading models do not include probability weighting functions, even though it is well known that humans' weighting of probability is typically nonlinear. To address these limitations, this article provides three main contributions. Our first contribution is a new human behavior model, SHARP, which mitigates these three limitations as follows: (i) SHARP reasons based on the success or failure of the adversary's past actions on exposed portions of the attack surface to model adversary adaptivity; (ii) SHARP reasons about similarity between exposed and unexposed areas of the attack surface, and also incorporates a discounting parameter to mitigate the adversary's lack of exposure to enough of the attack surface; and (iii) SHARP integrates a nonlinear probability weighting function to capture the adversary's true weighting of probability. Our second contribution is a first “repeated measures study” (at least in the context of SSGs) of competing human behavior models. This study, in which each experiment lasted multiple weeks with individual sets of human subjects on the Amazon Mechanical Turk platform, illustrates the strengths and weaknesses of different models and shows the advantages of SHARP. Our third major contribution is to demonstrate SHARP's superiority by conducting real-world human subjects experiments at the Bukit Barisan Selatan National Park in Indonesia against wildlife security experts.
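
To make the idea of a nonlinear probability weighting function concrete, the following is a minimal sketch rather than SHARP's implementation. The abstract does not state which functional form SHARP uses; this example assumes the two-parameter form of Gonzalez and Wu (1999), w(p) = δp^γ / (δp^γ + (1 − p)^γ), and the function name `weight` and the parameter values are illustrative assumptions only.

```python
def weight(p: float, delta: float, gamma: float) -> float:
    """Two-parameter probability weighting (assumed form, Gonzalez and Wu 1999).

    delta controls elevation and gamma controls curvature; both are
    hypothetical parameters here, not values taken from the article.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if delta <= 0.0 or gamma <= 0.0:
        raise ValueError("delta and gamma must be positive")
    num = delta * p ** gamma
    return num / (num + (1.0 - p) ** gamma)


if __name__ == "__main__":
    # With delta = gamma = 1 the function reduces to the identity (linear weighting).
    # With gamma < 1 it is inverse S-shaped: small probabilities are
    # overweighted and large ones underweighted.
    for p in (0.1, 0.5, 0.9):
        print(p, weight(p, 1.0, 1.0), weight(p, 1.0, 0.5))
```

In a security game setting, p would be the defender's coverage probability at a target as perceived by the adversary; how the weighted value enters the adversary's utility is not specified in the abstract and is left out of this sketch.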
