Combining Online Learning and Equilibrium Computation in Security Games

Game-theoretic analysis has emerged as an important method for making resource allocation decisions in both infrastructure protection and cyber security domains. However, static equilibrium models built from domain-expert inputs have weaknesses: they can be inaccurate, and they do not adapt over time as the situation (and the adversary) evolves. When interactions with an attacker are frequent, using learning to adapt to the adversary's revealed behavior may lead to better solutions in the long run. However, learning approaches need a lot of data, may perform poorly at the start, and may not be able to take advantage of expert analysis. We explore ways to combine equilibrium analysis with online learning methods, with the goal of gaining the advantages of both approaches. We present several hybrid methods that combine these techniques in different ways and empirically evaluate their performance in a game that models a border patrolling scenario.
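To make the idea of a hybrid concrete, the sketch below shows one possible way to seed an online learner with an expert-derived equilibrium strategy. It is an illustration only and is not taken from the paper: it assumes the standard EXP3 bandit algorithm as the learner, treats each arm as a border zone to patrol, and assumes rewards in [0, 1]. The names `warm_started_exp3`, `mixed_strategy`, `prior`, and `prior_weight` are hypothetical.

```python
import math
import random


def mixed_strategy(log_w, gamma):
    """Turn log-weights into EXP3 sampling probabilities with uniform exploration."""
    m = max(log_w)  # subtract the max for numerical stability
    w = [math.exp(x - m) for x in log_w]
    s = sum(w)
    k = len(w)
    return [(1 - gamma) * wi / s + gamma / k for wi in w]


def warm_started_exp3(prior, reward_fn, rounds, gamma=0.1, prior_weight=1.0, seed=0):
    """EXP3 whose initial weights come from a prior mixed strategy.

    prior        -- assumed equilibrium coverage probabilities, one per zone
    reward_fn    -- hypothetical callback: zone index -> reward in [0, 1]
    prior_weight -- how strongly the prior shapes the initial weights
    """
    rng = random.Random(seed)
    k = len(prior)
    # Seed log-weights from the prior instead of starting uniform.
    log_w = [prior_weight * math.log(max(p, 1e-12)) for p in prior]
    for _ in range(rounds):
        probs = mixed_strategy(log_w, gamma)
        zone = rng.choices(range(k), weights=probs)[0]
        reward = reward_fn(zone)
        # Importance-weighted update: only the patrolled zone is observed.
        log_w[zone] += gamma * (reward / probs[zone]) / k
    return mixed_strategy(log_w, gamma)


if __name__ == "__main__":
    equilibrium = [0.7, 0.2, 0.1]  # assumed expert-derived strategy
    # Hypothetical attacker who always targets zone 2.
    learned = warm_started_exp3(equilibrium, lambda z: 1.0 if z == 2 else 0.0, 2000)
    print([round(p, 3) for p in learned])
```

With zero rounds the learner simply plays the prior mixed with a small amount of uniform exploration, so early behavior reflects the expert analysis. As observations accumulate, the importance-weighted updates shift probability toward zones that actually yield rewards, even when the prior underweighted them.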
