Deep Reinforcement Learning based Adaptive Moving Target Defense

Moving target defense (MTD) is a proactive defense approach that aims to thwart attacks by continuously changing the attack surface of a system (e.g., changing host or network configurations), thereby increasing the adversary's uncertainty and attack cost. To maximize the impact of MTD, a defender must strategically choose when to make changes and which changes to make, taking into account both the characteristics of its system and the adversary's observed activities. Finding an optimal MTD strategy is a significant challenge, especially against a resourceful and determined adversary who may respond to the defender's actions. In this paper, we propose finding optimal MTD strategies using deep reinforcement learning. Based on an established model of adaptive MTD, we formulate the problem of finding an MTD strategy as that of finding a policy for a partially observable Markov decision process. To significantly improve training performance, we introduce compact memory representations. We evaluate our approach with thorough numerical results, which show significant improvement over existing strategies.
