An ensemble framework of anomaly detection using hybridized feature selection approach (HFSA)

The rapid growth and popularity of the internet have re-emphasized the significance of intrusion detection systems in network security. To address network security vulnerabilities, researchers have proposed various intrusion detection frameworks based on data mining. Feature selection is an important method for developing a time- and cost-effective intrusion detection system, since reducing the time needed to build the classifier model improves the efficiency of the system. This work analyzes intrusion detection approaches that combine machine learning methods with wrapper approaches, a type of feature selection methodology. The paper focuses on the classification accuracy of three classifiers using the minimal set of features selected by three wrapper search methods on the well-known public NSL-KDD dataset, and compares the results. The three base classifiers are Bayesian Network, Naive Bayes and J48. Best First, Genetic Search and Rank Search are used as the wrapper search methods. Based on this research framework, the study proposes an ensemble classification model with a hybrid feature selection method. The hybrid feature selection method selects 12 critical features, and combining the base classifiers yields a reliable model for distinguishing normal from anomalous instances. The model achieves a low false positive rate of 0.021. Experiments show that the proposed ensemble approach performs better than the individual Naive Bayes, Bayesian Network and J48 classifiers. All experiments were conducted on the NSL-KDD dataset using WEKA 3.6 library functions.
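As a rough illustration of how base classifier outputs might be combined and how a false positive rate is computed, the following minimal Python sketch may help. It assumes simple majority voting, which the abstract does not specify as the combination rule, and it does not use WEKA. The function names `majority_vote` and `false_positive_rate` and the toy labels are hypothetical and are not NSL-KDD results.

```python
from collections import Counter


def majority_vote(predictions_per_classifier):
    """Combine per-classifier label lists into one label per instance.

    Assumption: plain majority voting. With three voters and two labels
    a tie cannot occur, so no tie-breaking rule is needed here.
    """
    combined = []
    for votes in zip(*predictions_per_classifier):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def false_positive_rate(y_true, y_pred, positive="anomaly"):
    """FPR = FP / (FP + TN), treating `positive` as the alarm class."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return fp / (fp + tn) if (fp + tn) else 0.0


# Toy labels (made up for illustration only).
y_true = ["normal", "normal", "anomaly", "anomaly", "normal"]
bayes_net = ["normal", "anomaly", "anomaly", "anomaly", "normal"]
naive_bayes = ["normal", "normal", "anomaly", "normal", "anomaly"]
j48 = ["normal", "normal", "normal", "anomaly", "normal"]

ensemble = majority_vote([bayes_net, naive_bayes, j48])

for name, preds in [("BayesNet", bayes_net), ("NaiveBayes", naive_bayes),
                    ("J48", j48), ("Ensemble", ensemble)]:
    print(name, round(false_positive_rate(y_true, preds), 3))
```

In this toy case each base classifier makes a different mistake, so the vote cancels the individual errors. That is the general intuition behind combining classifiers, not a reproduction of the reported 0.021 figure.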
