Off-Policy Q-learning Technique for Intrusion Response in Network Security

With the increasing dependency on our computer devices, we face the necessity of adequate, efficient and effective mechanisms, for protecting our network. There are two main problems that Intrusion Detection Systems (IDS) attempt to solve. 1) To detect the attack, by analyzing the incoming traffic and inspect the network (intrusion detection). 2) To produce a prompt response when the attack occurs (intrusion prevention). It is critical creating an Intrusion detection model that will detect a breach in the system on time and also challenging making it provide an automatic and with an acceptable delay response at every single stage of the monitoring process. We cannot afford to adopt security measures with a high exploiting computational power, and we are not able to accept a mechanism that will react with a delay. In this paper, we will propose an intrusion response mechanism that is based on artificial intelligence, and more precisely, reinforcement learning techniques (RLT). The RLT will help us to create a decision agent, who will control the process of interacting with the undetermined environment. The goal is to find an optimal policy, which will represent the intrusion response, therefore, to solve the Reinforcement learning problem, using a Q-learning approach. Our agent will produce an optimal immediate response, in the process of evaluating the network traffic.This Q-learning approach will establish the balance between exploration and exploitation and provide a unique, self-learning and strategic artificial intelligence response mechanism for IDS. Keywords—Intrusion prevention, network security, optimal policy, Q-learning.

[1]  James Cannady Applying CMAC-based online learning to intrusion detection , 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium.

[2]  Cannady,et al.  Next Generation Intrusion Detection: Autonomous Reinforcement Learning of Network Attacks , 2000 .

[3]  Christin Schäfer,et al.  Learning Intrusion Detection: Supervised or Unsupervised? , 2005, ICIAP.

[4]  Sean P. Meyn,et al.  An analysis of reinforcement learning with function approximation , 2008, ICML '08.

[5]  Zheni Stefanova,et al.  Network attribute selection, classification and accuracy (NASCA) procedure for intrusion detection systems , 2017, 2017 IEEE International Symposium on Technologies for Homeland Security (HST).

[6]  Ali A. Ghorbani,et al.  A detailed analysis of the KDD CUP 99 data set , 2009, 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications.

[7]  Xin Xu,et al.  A Kernel-Based Reinforcement Learning Approach to Dynamic Behavior Modeling of Intrusion Detection , 2007, ISNN.

[8]  John N. Tsitsiklis,et al.  Asynchronous stochastic approximation and Q-learning , 1994, Mach. Learn..

[9]  Xin Xu,et al.  Kernel Least-Squares Temporal Difference Learning , 2006 .

[10]  Peter Dayan,et al.  Q-learning , 1992, Machine Learning.

[11]  Ufuk Topcu,et al.  Probably Approximately Correct MDP Learning and Control With Temporal Logic Constraints , 2014, Robotics: Science and Systems.

[12]  Yishay Mansour,et al.  Learning Rates for Q-learning , 2004, J. Mach. Learn. Res..

[13]  VARUN CHANDOLA,et al.  Anomaly detection: A survey , 2009, CSUR.

[14]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[15]  Xin Xu,et al.  A Reinforcement Learning Approach for Host-Based Intrusion Detection Using Sequences of System Calls , 2005, ICIC.

[16]  Atsushi Inoue,et al.  Collaborative intrusion detection system , 2003, 22nd International Conference of the North American Fuzzy Information Processing Society, NAFIPS 2003.