An incremental intrusion detection system using a new semi‐supervised stream classification method

Summary In recent years, the use of machine learning and data mining techniques for intrusion detection has received great attention from both security research communities and intrusion detection system (IDS) developers. In intrusion detection, the most important constraints are the imbalanced class distribution, the scarcity of labeled data, and the massive volume of network flows. Moreover, because network flows are dynamic, applying statically learned models degrades detection performance significantly over time. In this article, we propose a new semi-supervised stream classification method for intrusion detection that can be updated incrementally using limited labeled data. The proposed method, called the incremental semi-supervised flow network-based IDS (ISF-NIDS), relies on incremental mixed-data clustering, a new supervised cluster adjustment method, and instance-based learning. The ISF-NIDS operates in real time and learns new intrusions quickly using limited storage and processing power. The experimental results on the KDD99, Moore, and Sperotto benchmark datasets indicate the superiority of the proposed method compared with existing state-of-the-art incremental IDSs.
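
The general idea of combining incremental clustering, a small number of labels, and instance-based prediction can be illustrated with a minimal sketch. This is not the ISF-NIDS algorithm: the class name `IncrementalClusterClassifier`, the `radius` threshold, the Euclidean distance on numeric features, and the majority-vote labeling are all assumptions made for illustration, whereas the summary describes a mixed-data clustering and a dedicated supervised cluster adjustment step.

```python
import math
from collections import Counter


class IncrementalClusterClassifier:
    """Toy sketch (hypothetical, not ISF-NIDS): online clustering of flows,
    label votes per cluster, and nearest-labeled-cluster prediction."""

    def __init__(self, radius):
        self.radius = radius   # assumed threshold for opening a new cluster
        self.centroids = []    # running mean of each cluster
        self.counts = []       # number of flows absorbed by each cluster
        self.labels = []       # Counter of labels observed in each cluster

    def _nearest(self, x, labeled_only=False):
        best, best_d = None, math.inf
        for i, c in enumerate(self.centroids):
            if labeled_only and not self.labels[i]:
                continue
            d = math.dist(x, c)
            if d < best_d:
                best, best_d = i, d
        return best, best_d

    def update(self, x, label=None):
        """Absorb one flow; the label is optional (semi-supervised)."""
        i, d = self._nearest(x)
        if i is None or d > self.radius:
            # No cluster is close enough: start a new one at this flow.
            self.centroids.append(list(x))
            self.counts.append(1)
            self.labels.append(Counter())
            i = len(self.centroids) - 1
        else:
            # Move the centroid toward the new flow (incremental mean).
            self.counts[i] += 1
            n = self.counts[i]
            self.centroids[i] = [c + (v - c) / n
                                 for c, v in zip(self.centroids[i], x)]
        if label is not None:
            self.labels[i][label] += 1
        return i

    def predict(self, x):
        """Return the majority label of the nearest labeled cluster."""
        i, _ = self._nearest(x, labeled_only=True)
        if i is None:
            return None
        return self.labels[i].most_common(1)[0][0]


# Two labeled flows and two unlabeled flows, using made-up feature values.
clf = IncrementalClusterClassifier(radius=1.0)
clf.update((0.0, 0.0), "normal")
clf.update((0.2, 0.1))
clf.update((5.0, 5.0), "attack")
clf.update((5.1, 4.9))
```

Only the centroids, counts, and label votes are stored, so memory grows with the number of clusters rather than with the number of flows, which is the property that makes this style of model attractive for streams.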
