Feature selection and intrusion classification in NSL-KDD cup 99 dataset employing SVMs

Intrusion is the violation of information security policy by malicious activities. Intrusion detection (ID) is a series of actions for detecting and recognising suspicious activity that compromises the confidentiality, quality, consistency, and availability of a computer-based network system. In this paper, we present a new approach that merges feature selection and classification for the multi-class NSL-KDD cup 99 intrusion detection dataset, employing a support vector machine (SVM). The objective is to improve the competence of intrusion classification with a significantly reduced set of input features from the training data. In supervised learning, feature selection is the process of selecting the important input training features and removing the irrelevant ones, with the objective of obtaining a feature subset that produces higher classification accuracy. In the experiment, we applied an SVM classifier to several input feature subsets of the NSL-KDD cup 99 training dataset. The experimental results show that the proposed method achieves 91% classification accuracy using only three features and 99% classification accuracy using 36 features, while all 41 training features also achieve 99% classification accuracy.
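The idea of training an SVM on several feature subsets and comparing accuracy can be illustrated with a minimal sketch. It is not the method or the data used in the paper: it assumes a synthetic two-class dataset with three features (instead of the multi-class, 41-feature NSL-KDD data) and a simple linear SVM trained by stochastic subgradient descent on the regularised hinge loss. All function names, parameter values, and feature subsets below are hypothetical and chosen for illustration only.

```python
import random


def make_data(n, rng):
    """Synthetic two-class data (assumption): feature 0 is informative,
    feature 1 is weakly informative, feature 2 is pure noise."""
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1  # alternate labels to keep classes balanced
        X.append([rng.gauss(2.0 * label, 1.0),
                  rng.gauss(0.5 * label, 1.0),
                  rng.gauss(0.0, 1.0)])
        y.append(label)
    return X, y


def project(X, subset):
    """Keep only the columns listed in `subset`."""
    return [[row[j] for j in subset] for row in X]


def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=20, seed=0):
    """Linear SVM via stochastic subgradient descent on the L2-regularised hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # Shrink the weights (gradient of the L2 penalty).
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Hinge-loss subgradient step only when the margin is violated.
            if y[i] * score < 1.0:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def accuracy(w, b, X, y):
    """Fraction of samples whose predicted sign matches the label."""
    correct = 0
    for row, label in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, row)) + b
        pred = 1 if score >= 0 else -1
        correct += int(pred == label)
    return correct / len(y)


def evaluate_subsets(subsets, n_train=400, n_test=200, seed=42):
    """Train one SVM per feature subset and report held-out accuracy."""
    rng = random.Random(seed)
    X, y = make_data(n_train + n_test, rng)
    X_train, y_train = X[:n_train], y[:n_train]
    X_test, y_test = X[n_train:], y[n_train:]
    results = {}
    for subset in subsets:
        w, b = train_linear_svm(project(X_train, subset), y_train)
        results[subset] = accuracy(w, b, project(X_test, subset), y_test)
    return results


# Hypothetical subsets: informative only, noise only, and all three features.
results = evaluate_subsets([(0,), (2,), (0, 1, 2)])
for subset, acc in results.items():
    print(f"features {subset}: accuracy {acc:.3f}")
```

In this toy setting, a subset containing the informative feature should score close to the full feature set, while a subset containing only noise should stay near chance level, which mirrors the kind of comparison reported above for 3, 36, and 41 features.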
