Feature selection by multi-objective optimisation: Application to network anomaly detection by hierarchical self-organising maps

Feature selection is an important and active research topic in clustering and classification. Choosing an adequate feature subset reduces the dimensionality of the dataset, which lowers the computational cost of classification and improves classifier performance by discarding redundant or irrelevant features. Although feature selection can be formally defined as a single-objective optimisation problem, where the objective is the classification accuracy obtained with the selected feature subset, several multi-objective approaches have been proposed in recent years. For supervised classifiers, these approaches select features that improve not only the classification accuracy but also the generalisation capability. For unsupervised classifiers, they counterbalance the bias toward lower or higher numbers of features exhibited by some of the methods used to validate the clustering/classification. The main contribution of this paper is a multi-objective approach to feature selection and its application to an unsupervised clustering procedure based on Growing Hierarchical Self-Organising Maps (GHSOMs), which includes a new method for unit labelling and an efficient determination of the winning unit. In the network anomaly detection problem considered here, this multi-objective approach makes it possible not only to differentiate between normal and anomalous traffic but also to distinguish among different anomalies. The efficiency of our proposals has been evaluated using the well-known DARPA/NSL-KDD datasets, which contain extracted features and labelled attacks from around 2 million connections. The feature sets selected in our experiments provide detection rates of up to 99.8% for normal traffic and up to 99.6% for anomalous traffic, as well as accuracy values of up to 99.12%.
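
To make the multi-objective idea concrete, the following minimal Python sketch filters a set of candidate feature subsets down to its non-dominated (Pareto-optimal) members. It is an illustration only, not the procedure used in the paper: the two objectives (a clustering-validity score and the number of selected features, both minimised), the candidate masks, and the numeric values are all hypothetical, and the names `dominates` and `pareto_front` are introduced here for the example.

```python
from typing import List, Sequence, Tuple

Mask = Tuple[int, ...]          # 1 = feature selected, 0 = feature dropped
Objectives = Tuple[float, ...]  # all objectives are minimised in this sketch


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is no worse than `b` on every objective
    and strictly better on at least one (minimisation)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(
    candidates: List[Tuple[Mask, Objectives]]
) -> List[Tuple[Mask, Objectives]]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        (mask, obj)
        for mask, obj in candidates
        if not any(dominates(other, obj) for _, other in candidates)
    ]


# Hypothetical candidates: (feature mask, (validity score, feature count)).
candidates = [
    ((1, 1, 0, 0), (0.40, 2)),
    ((1, 1, 1, 0), (0.35, 3)),
    ((1, 1, 1, 1), (0.36, 4)),  # dominated by the 3-feature subset
    ((1, 0, 0, 0), (0.60, 1)),
    ((0, 1, 1, 0), (0.45, 2)),  # dominated by the first 2-feature subset
]

front = pareto_front(candidates)
for mask, obj in front:
    print(mask, obj)
```

Each surviving subset represents a different trade-off between the two objectives, which is why a multi-objective search returns a set of feature subsets rather than a single best one. An evolutionary algorithm such as NSGA-II would generate and refine the candidates; this sketch covers only the dominance test applied to them.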
