Paper Information - Stochastic Dynamic Information Flow Tracking Game with Reinforcement Learning


Advanced Persistent Threats (APTs) are stealthy, sophisticated, long-term attacks that impose significant economic costs and violate the security of sensitive information. Data and control flow commands arising from APTs introduce new information flows into the targeted computer system. Dynamic Information Flow Tracking (DIFT) is a promising detection mechanism against APTs that taints suspicious input sources in the system and authenticates the tainted flows at certain processes according to a well-defined security policy. Employing DIFT to defend against APTs in large-scale cyber systems is restricted by the heavy resource and performance overhead it introduces on the system. The objective of this paper is to model resource-efficient DIFT that successfully detects APTs. We develop a game-theoretic framework and provide an analytical model of DIFT that enables the study of the trade-off between resource efficiency and the quality of detection in DIFT. Our proposed infinite-horizon, nonzero-sum stochastic game captures the performance parameters of DIFT, such as false alarms and false negatives, and considers an attacker model in which the APT can relaunch the attack if it fails in a previous attempt and thereby continuously threaten the system. We assume some of the performance parameters of DIFT are unknown. We propose a model-free reinforcement learning algorithm that converges to a Nash equilibrium of the discounted stochastic game between APT and DIFT. We execute and evaluate the proposed algorithm on a real-world nation-state attack dataset.
