Efficient intrusion detection using representative instances

Because of their feasibility and effectiveness, artificial intelligence-based intrusion detection systems attract considerable interest from researchers. However, when confronted with large-scale data sets, many of these systems can suffer from a high computational burden, even when feature selection is used to reduce the computational complexity. To improve efficiency, we propose a representative instance selection method that preprocesses the original data set before a classifier is trained. The method is independent of the learning algorithm used to construct the intrusion detection system. In this study, a new metric is introduced to measure the representative power of an instance with respect to its class. Using an implementation of this representativeness metric, we select the most representative instance in each subset produced by a novel centroid-based partitioning strategy, and we then use the selected instances as training data to build various intrusion detection models efficiently. Experimental results on a labelled flow-based data set introduced in 2009 show that ANN, KNN, SVM and Liblinear classifiers trained on a greatly reduced set of representative instances not only detect network attacks with high efficiency but also provide detection rate, precision, F-score and accuracy comparable to those of the four corresponding classifiers built with the original large data set.
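The selection pipeline described above (partition each class into subsets, then keep one representative instance per subset) can be illustrated with a minimal sketch. The abstract does not define the representativeness metric or the details of the partitioning strategy, so both are stand-in assumptions here: each class is ordered by distance to its class centroid and cut into chunks of a hypothetical `subset_size`, and the instance nearest to the chunk centroid is taken as the representative. The function names are illustrative only and are not the authors' implementation.

```python
import math
from collections import defaultdict


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def select_representatives(X, y, subset_size=3):
    """Return sorted indices of one representative instance per subset.

    Assumptions (not taken from the paper): each class is split into
    chunks of `subset_size` after ordering its instances by distance to
    the class centroid, and the representative of a chunk is the
    instance nearest to that chunk's centroid.
    """
    # Group instance indices by class label.
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)

    selected = []
    for idxs in by_class.values():
        class_centroid = centroid([X[i] for i in idxs])
        # Order by distance to the class centroid; ties broken by index.
        ordered = sorted(
            idxs, key=lambda i: (math.dist(X[i], class_centroid), i)
        )
        for start in range(0, len(ordered), subset_size):
            chunk = ordered[start:start + subset_size]
            chunk_centroid = centroid([X[i] for i in chunk])
            # Stand-in representativeness: closeness to the chunk centroid.
            best = min(
                chunk, key=lambda i: (math.dist(X[i], chunk_centroid), i)
            )
            selected.append(best)
    return sorted(selected)


# Toy data: two well-separated classes of three flows each.
X = [[0, 0], [1, 0], [2, 0], [10, 10], [11, 10], [12, 10]]
y = ["normal", "normal", "normal", "attack", "attack", "attack"]
print(select_representatives(X, y, subset_size=3))  # -> [1, 4]
```

The returned indices would then be used to subset the training data before fitting any classifier, which is what makes this kind of preprocessing independent of the learning algorithm.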
