A feature selection approach to find optimal feature subsets for the network intrusion detection system

The accuracy and efficiency of network intrusion detection systems based on machine learning techniques depend largely on the selected features. However, choosing the optimal subset from the commonly used features requires extensive computing resources: the number of possible non-empty feature subsets of n given features is $2^{n}-1$. In this paper, we propose an optimal feature selection algorithm to tackle this problem. The proposed algorithm is based on local search, one of the representative meta-heuristic algorithms for solving computationally hard optimization problems. In particular, the clustering accuracy obtained by applying the k-means clustering algorithm to the training data set is used as the cost function that measures the goodness of a feature subset. To evaluate the performance of the proposed algorithm, we compare it against the feature set composed of all 41 features on the NSL-KDD data set using a multi-layer perceptron.
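The idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the neighborhood structure, the mapping from clusters to classes, or any parameters, so the single-feature flip neighborhood, the majority-vote accuracy, the small k-means routine, and the synthetic data below are all assumptions made for illustration. For scale, with n = 41 features an exhaustive search would have to score 2^41 - 1 = 2,199,023,255,551 subsets, which is why a heuristic search is attractive.

```python
import random

def kmeans_labels(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # initial centers: k random points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(pt, centers[c])))
            for pt in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [pt for pt, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def clustering_accuracy(X, y, subset, k=2):
    """Cost function (assumed form): cluster on the chosen features only,
    label each cluster by its majority class, return the fraction correct."""
    cols = sorted(subset)
    projected = [[row[j] for j in cols] for row in X]
    labels = kmeans_labels(projected, k)
    correct = 0
    for c in range(k):
        classes = [cls for cls, l in zip(y, labels) if l == c]
        if classes:
            correct += max(classes.count(v) for v in set(classes))
    return correct / len(y)

def local_search(X, y, n_features, max_rounds=50, seed=0):
    """Hill climbing over feature subsets; a neighbor differs from the
    current subset by adding or removing exactly one feature (assumption)."""
    rng = random.Random(seed)
    current = {j for j in range(n_features) if rng.random() < 0.5} or {0}
    best = clustering_accuracy(X, y, current)
    for _ in range(max_rounds):
        improved = False
        for j in range(n_features):
            neighbor = current ^ {j}         # flip membership of feature j
            if not neighbor:
                continue                     # never evaluate the empty subset
            score = clustering_accuracy(X, y, neighbor)
            if score > best:                 # accept the first improving move
                current, best, improved = neighbor, score, True
        if not improved:
            break                            # local optimum reached
    return current, best

def make_toy_data(n_per_class=30, n_features=6, seed=1):
    """Synthetic stand-in for a training set (NOT NSL-KDD): feature 0
    separates the two classes, the remaining features are uniform noise."""
    rng = random.Random(seed)
    X, y = [], []
    for cls in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(4.0 * cls, 0.5)]
            row += [rng.uniform(0.0, 4.0) for _ in range(n_features - 1)]
            X.append(row)
            y.append(cls)
    return X, y

X, y = make_toy_data()
subset, score = local_search(X, y, n_features=len(X[0]))
print("selected features:", sorted(subset), "clustering accuracy:", score)
print("subsets for n = 41:", 2 ** 41 - 1)
```

Local search only guarantees a local optimum, so in practice one would likely restart from several random subsets and keep the best result. The selected subset would then be passed to the downstream classifier (a multi-layer perceptron in the paper) for the final comparison against the full feature set.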
