Paper information - Counterfactual regret minimization for integrated cyber and air defense resource allocation

Abstract: This research presents a new application of optimal and approximate solution techniques to solve resource allocation problems with imperfect information in the cyber and air-defense domains. We develop a two-player, zero-sum, extensive-form game to model attacker and defender roles in both physical and cyber space. We reformulate the problem to find a Nash equilibrium using an efficient, sequence-form linear program. Solving this linear program produces optimal defender strategies for the multi-domain security game. We address large problem instances with an application of the approximate counterfactual regret minimization algorithm. This approximation reduces computation time by 95% while maintaining an optimality gap of less than 3%. Our application of discounted counterfactual regret results in a further 36% reduction in computation time from the base algorithm. We develop domain insights through a designed experiment to explore the parameter space of the problem and algorithm. We also address robust opponent exploitation by combining existing techniques to extend the counterfactual regret algorithm to include a discounted, constrained variant. A comparison of robust linear programming, data-biased response, and constrained counterfactual regret approaches clarifies trade-offs between exploitation and exploitability for each method. The robust linear programming approach is the most effective, producing an exploitation-to-exploitability ratio of 10.8 to 1.
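The abstract refers to counterfactual regret minimization and to exploitability without room to show either. As a rough illustration only, the sketch below runs regret matching (the per-decision update that counterfactual regret minimization applies at every information set) in self-play on a tiny zero-sum matrix game, then measures exploitability of the averaged strategies. This is not the authors' model or implementation: the two-target payoff matrix, the function names, and the iteration count are all assumptions made for the example, and the game is a single simultaneous move rather than an extensive-form game.

```python
def regret_matching(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)  # no positive regret: play uniformly


def action_values(payoff, row, col):
    """Expected payoff of every pure action against the opponent's mixed strategy.

    payoff[i][j] is the row player's (defender's) payoff; the column player
    (attacker) receives the negative of it, so the game is zero-sum.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_values = [sum(payoff[i][j] * col[j] for j in range(n_cols))
                  for i in range(n_rows)]
    col_values = [-sum(row[i] * payoff[i][j] for i in range(n_rows))
                  for j in range(n_cols)]
    return row_values, col_values


def solve_matrix_game(payoff, iterations=20000):
    """Approximate a Nash equilibrium by regret-matching self-play.

    Returns the time-averaged strategies of both players.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n_rows, [0.0] * n_cols
    row_sum, col_sum = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(iterations):
        row = regret_matching(row_regret)
        col = regret_matching(col_regret)
        row_values, col_values = action_values(payoff, row, col)
        row_ev = sum(p * v for p, v in zip(row, row_values))
        col_ev = sum(p * v for p, v in zip(col, col_values))
        for i in range(n_rows):
            row_regret[i] += row_values[i] - row_ev  # regret for not playing i
            row_sum[i] += row[i]
        for j in range(n_cols):
            col_regret[j] += col_values[j] - col_ev  # regret for not playing j
            col_sum[j] += col[j]
    return ([s / iterations for s in row_sum],
            [s / iterations for s in col_sum])


def exploitability(payoff, row, col):
    """Total gain available to best responses; zero at an exact equilibrium."""
    row_values, col_values = action_values(payoff, row, col)
    return max(row_values) + max(col_values)


# Hypothetical toy game: the defender protects one of two targets worth 1 and 2,
# and loses the value of whichever undefended target the attacker hits.
toy_payoff = [[0.0, -2.0],   # defend target 0
              [-1.0, 0.0]]   # defend target 1
defender, attacker = solve_matrix_game(toy_payoff)
print(defender, attacker, exploitability(toy_payoff, defender, attacker))
```

For this toy game the exact equilibrium can be found by hand: the defender protects the higher-value target with probability 2/3, and the attacker strikes the lower-value target with probability 2/3. The averaged strategies should approach those values as the iteration count grows, with exploitability shrinking toward zero.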

Darryl K. Ahner | Andrew J. Keith

[1] Vincent Conitzer, et al. Security scheduling for real-world networks, 2013, AAMAS.

[2] Milind Tambe, et al. Robust Protection of Fisheries with COmPASS, 2014, AAAI.

[3] Alan Edelman, et al. Julia: A Fresh Approach to Numerical Computing, 2014, SIAM Rev.

[4] Demosthenis Teneketzis, et al. A POMDP Approach to the Dynamic Defense of Large-Scale Cyber Networks, 2018, IEEE Transactions on Information Forensics and Security.

[5] Vincent Conitzer, et al. Stackelberg vs. Nash in Security Games: An Extended Investigation of Interchangeability, Equivalence, and Uniqueness, 2011, J. Artif. Intell. Res.

[6] Milind Tambe, et al. Stop the compartmentalization: unified robust algorithms for handling uncertainties in security games, 2014, AAMAS.

[7] Steve Alpern, et al. Patrolling Games, 2011, Oper. Res.

[8] Darryl K. Ahner, et al. Optimal multi-stage allocation of weapons to targets using adaptive dynamic programming, 2015, Optim. Lett.

[9] Erim Kardes. On discounted stochastic games with incomplete information on payoffs and a security application, 2014, Oper. Res. Lett.

[10] Darryl K. Ahner, et al. The Weapon-Target Assignment Problem, 2019, Comput. Oper. Res.

[11] Brian J. Lunday, et al. Heterogeneous surface-to-air missile defense battery location: a game theoretic approach, 2017, J. Heuristics.

[12] Yoav Shoham, et al. Multiagent Systems - Algorithmic, Game-Theoretic, and Logical Foundations, 2009.

[13] Branislav Bosanský, et al. An Exact Double-Oracle Algorithm for Zero-Sum Extensive-Form Games with Imperfect Information, 2014, J. Artif. Intell. Res.

[14] Michael H. Bowling, et al. Data Biased Robust Counter Strategies, 2009, AISTATS.

[15] Brian J. Lunday, et al. A Game Theoretic Model for the Optimal Location of Integrated Air Defense System Missile Batteries, 2016, INFORMS J. Comput.

[16] Yongchao Liu, et al. Distributionally robust equilibrium for continuous games: Nash and Stackelberg models, 2018, Eur. J. Oper. Res.

[17] Yurii Nesterov, et al. Excessive Gap Technique in Nonsmooth Convex Minimization, 2005, SIAM J. Optim.

[18] Michael H. Bowling, et al. Computing Robust Counter-Strategies, 2007, NIPS.

[19] Hervé Debar, et al. A Survey on Game-Theoretic Approaches for Intrusion Detection and Response Optimization, 2018, ACM Comput. Surv.

[20] Dimitris Bertsimas, et al. Robust game theory, 2006, Math. Program.

[21] Michael H. Bowling, et al. Efficient Nash equilibrium approximation through Monte Carlo counterfactual regret minimization, 2012, AAMAS.

[22] Michael H. Bowling, et al. Solving Heads-Up Limit Texas Hold'em, 2015, IJCAI.

[23] Frederick R. Forst, et al. On robust estimation of the location parameter, 1980.

[24] Warren B. Powell. Approximate dynamic programming: Solving the curses of dimensionality, 2007, Wiley Series in Probability and Statistics.

[25] Michael N. Gagnon, et al. Towards Net-Centric Cyber Survivability for Ballistic Missile Defense, 2010, ISARCS.

[26] Sam Ganzfried, et al. Bayesian Opponent Exploitation in Imperfect-Information Games, 2016, 2018 IEEE Conference on Computational Intelligence and Games (CIG).

[27] Tuomas Sandholm, et al. Solving Large Sequential Games with the Excessive Gap Technique, 2018, NeurIPS.

[28] Pasquale Malacaria, et al. Scalable min-max multi-objective cyber-security optimisation over probabilistic attack graphs, 2019, Eur. J. Oper. Res.

[29] Michael G. H. Bell, et al. Attacker-defender model against quantal response adversaries for cyber security in logistics management: An introductory study, 2019, Eur. J. Oper. Res.

[30] Michael H. Bowling, et al. Regret Minimization in Games with Incomplete Information, 2007, NIPS.

[31] M. Buffelli, et al. Perinatal switch from synchronous to asynchronous activity of motoneurons: Link with synapse elimination, 2002, Proceedings of the National Academy of Sciences of the United States of America.

[32] R. Lougee-Heimer, et al. The Common Optimization INterface for Operations Research: Promoting open-source software in the operations research community, 2003.

[33] Michael H. Bowling, et al. Counterfactual Regret Minimization in Sequential Security Games, 2016, AAAI.

[34] Milind Tambe, et al. Addressing Behavioral Uncertainty in Security Games: An Efficient Robust Strategic Solution for Defender Patrols, 2016, 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).

[35] Milind Tambe, et al. From physical security to cybersecurity, 2015, J. Cybersecur.

[36] Richard F. Deckro, et al. Simulating attacker and defender strategies within a dynamic game on network topology, 2018, J. Simulation.

[37] Marc Lanctot, et al. Computing Approximate Nash Equilibria and Robust Best-Responses Using Sampling, 2011, J. Artif. Intell. Res.

[38] Darryl K. Ahner, et al. Real-time heuristic algorithms for the static weapon target assignment problem, 2018, J. Heuristics.

[39] Tuomas Sandholm, et al. Safe Opponent Exploitation, 2015, ACM Trans. Economics and Comput.

[40] Milind Tambe, et al. Trends and Applications in Stackelberg Security Games, 2018.

[41] Oualid Jouini, et al. Distributionally robust chance-constrained games: existence and characterization of Nash equilibrium, 2016, Optimization Letters.

[42] Claude Mirodatos, et al. Hydrogen production from crude pyrolysis oil by a sequential catalytic process, 2007.

[43] Raymond R. Hill, et al. A bilevel exposure-oriented sensor location problem for border security, 2018, Comput. Oper. Res.

[44] Brian J. Lunday, et al. Approximate dynamic programming for missile defense interceptor fire control, 2017, Eur. J. Oper. Res.