Dynamic policy-based IDS configuration

An intrusion detection system (IDS) is an important security enforcement tool in modern networked information systems. Obtaining an IDS configuration that detects attacks effectively is far from trivial: there is a tradeoff between the level of security enforcement and the performance of the information system, so the IDS must be configured dynamically and iteratively to balance security overhead against system performance. In this paper, we address this problem using noncooperative game approaches. We first build a fundamental game framework that models the zero-sum interaction between the detector and the attacker. Building on this framework, we then formulate a stochastic game in which the transitions between system states are determined by the actions of both players. An optimal policy-based configuration can be found by minimizing a discounted cost criterion with an iterative method. In addition, we propose a Q-learning algorithm that finds the optimal game values when the transition probabilities between system states are unknown. We show the convergence of the algorithm to the optimal Q-function and illustrate the concepts by simulation.
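
To make the learning step concrete, below is a minimal sketch of a Littman-style minimax Q-learning loop for a zero-sum stochastic game of the kind the abstract describes: each update solves the matrix game induced by Q at the sampled next state (via a linear program) to obtain the discounted continuation value. The two-state, two-action model, the stage costs, and the transition kernel are invented for illustration only; they are not the paper's model, and the paper's own algorithm and convergence analysis may differ in details such as the exploration scheme and step-size schedule.

```python
"""Sketch: minimax Q-learning for a zero-sum stochastic IDS game.
All states, actions, costs, and transition numbers are illustrative
assumptions, not values taken from the paper."""
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Hypothetical model: 2 system states (e.g. "normal", "degraded"),
# 2 detector configurations (light vs. deep inspection),
# 2 attacker actions (stealthy vs. aggressive).
N_S, N_D, N_A = 2, 2, 2
BETA = 0.95      # discount factor
ALPHA0 = 0.5     # initial learning rate

def matrix_game_value(M):
    """Value of the zero-sum matrix game with cost matrix M, where the
    row player (detector) minimizes and the column player (attacker)
    maximizes.  LP:  min v  s.t.  x . M[:, j] <= v for all columns j,
    sum(x) = 1, x >= 0.  Decision vector is [x_0..x_{n-1}, v]."""
    n_rows, n_cols = M.shape
    c = np.zeros(n_rows + 1)
    c[-1] = 1.0
    A_ub = np.hstack([M.T, -np.ones((n_cols, 1))])  # M[:, j].x - v <= 0
    b_ub = np.zeros(n_cols)
    A_eq = np.ones((1, n_rows + 1))
    A_eq[0, -1] = 0.0
    b_eq = np.array([1.0])
    bounds = [(0, None)] * n_rows + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    return res.x[-1], res.x[:-1]  # game value, detector mixed strategy

# Hypothetical stage costs c(s, a_d, a_a) and transition kernel
# P(s' | s, a_d, a_a); the learner only samples it through interaction.
COST = rng.uniform(0.0, 10.0, size=(N_S, N_D, N_A))
P = rng.dirichlet(np.ones(N_S), size=(N_S, N_D, N_A))

def step(s, a_d, a_a):
    """Simulated environment: stage cost and sampled next state."""
    s_next = rng.choice(N_S, p=P[s, a_d, a_a])
    return COST[s, a_d, a_a], s_next

Q = np.zeros((N_S, N_D, N_A))
visits = np.zeros_like(Q)
s = 0
for t in range(5000):
    a_d, a_a = rng.integers(N_D), rng.integers(N_A)  # exploratory play
    cost, s_next = step(s, a_d, a_a)
    v_next, _ = matrix_game_value(Q[s_next])         # continuation value
    visits[s, a_d, a_a] += 1
    alpha = ALPHA0 / visits[s, a_d, a_a]             # decaying step size
    Q[s, a_d, a_a] += alpha * (cost + BETA * v_next - Q[s, a_d, a_a])
    s = s_next

for s in range(N_S):
    v, x = matrix_game_value(Q[s])
    print(f"state {s}: game value {v:.2f}, detector policy {np.round(x, 3)}")
```

When the transition kernel is known, the same matrix-game solver can instead be used inside a Shapley-style value-iteration backup over all states, which corresponds to the iterative discounted-cost method mentioned above for the known-model case.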
