Designing of on line intrusion detection system using rough set theory and Q-learning algorithm

This paper proposes an efficient real-time intrusion detection system (IDS) that integrates the Q-learning algorithm with rough set theory (RST). The objective is to maximize classification accuracy when detecting intrusions by classifying NSL-KDD network traffic data as either 'normal' or 'anomaly'. Because RST processes only discrete data, the attributes in the training data are discretized by applying cut operations. Using the indiscernibility concept of RST, reduced attribute sets, called reducts, are obtained, and the single reduct that provides the highest classification accuracy is chosen. For the test data, however, the same reduct would not provide the highest classification accuracy, because the discretized attribute values change. To overcome this problem, the paper treats discretization and feature selection jointly and systematically using a machine learning approach. The Q-learning algorithm is modified to learn the optimum cut value for each attribute, so that the corresponding reduct yields maximum classification accuracy when classifying network traffic data. Since only the attributes in the reduct, rather than all attributes, take part in detecting intrusions, the proposed algorithm is faster than Q-learning and reduces the complexity of the IDS. A classification accuracy of 98% has been obtained using real-time data, which demonstrates superior performance compared to other classifiers.
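The building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`discretize`, `is_consistent`, `reducts`, `q_update`), the toy data, and the cut values are all assumptions made for illustration. The reduct search is a brute-force simplification that assumes a consistent decision table, and `q_update` is the standard tabular Q-learning rule, not the modified version the paper proposes.

```python
from collections import defaultdict
from itertools import combinations


def discretize(rows, cuts):
    """Replace each numeric value by the index of the interval its cuts define."""
    return [
        tuple(sum(value >= c for c in cuts[j]) for j, value in enumerate(row))
        for row in rows
    ]


def is_consistent(rows, labels, attrs):
    """True if objects indiscernible on `attrs` always carry the same label."""
    seen = {}
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        if seen.setdefault(key, label) != label:
            return False
    return True


def reducts(rows, labels):
    """Brute-force search for minimal consistent attribute subsets (toy sizes only)."""
    n_attrs = len(rows[0])
    found = []
    for size in range(1, n_attrs + 1):
        for attrs in combinations(range(n_attrs), size):
            # Skip supersets of an already found reduct: they are not minimal.
            if any(set(r) <= set(attrs) for r in found):
                continue
            if is_consistent(rows, labels, attrs):
                found.append(attrs)
    return found


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update (not the paper's modified rule)."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


# Hypothetical toy traffic records with three numeric attributes.
raw = [(0.2, 10, 1.0), (0.8, 10, 3.0), (0.3, 50, 1.0), (0.9, 50, 3.0)]
labels = ["normal", "anomaly", "normal", "anomaly"]
cuts = [[0.5], [30], [2.0]]  # one assumed cut point per attribute

table = discretize(raw, cuts)
found_reducts = reducts(table, labels)

q = defaultdict(float)
q_update(q, state=0, action="raise_cut", reward=1.0, next_state=1,
         actions=["raise_cut", "lower_cut"])
```

In a setting like the one the abstract describes, the reward passed to the update could be the classification accuracy achieved by the reduct under the current cuts, but how the paper actually defines states, actions, and rewards is not stated in the abstract.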
