Integrated Intrusion Detection Model Using Chi-Square Feature Selection and Ensemble of Classifiers

Intrusion detection system is a device or software application that monitors a network of systems to identify any malicious activity or policy violations. In order to identify intrusions or normal activity, IDS would consider different network-related features such as source address, protocol and flag. The major challenge for any intrusion detection model is to achieve maximum accuracy with minimal false alarms. The aim of this paper is to identify the critical features required in the construction of intrusion detection model, thereby achieving the maximum accuracy. The model utilizes an ensemble approach of classifiers with minimum complexity to overcome the issues in the existing ensemble-based intrusion detection models. In this paper, Chi-square feature selection and the ensemble of classifiers such as support vector machine (SVM), modified Naive Bayes (MNB) and LPBoost are utilized to develop an intrusion detection model. The motivation for selecting Chi-square feature selection is that they rank the features based on the statistical significance test and consider only those features that are dependent on the class label. Supervised classifiers are highly consistent and produce precise results as the use of training data improves the ability to distinguish between classes with similar features. Experimental results indicate high accuracy in comparison with base classifiers by the ensemble of LPBoost. As there is a huge class imbalance present in the network traffic, the prediction of the class label by a majority voting of SVM, MNB and LPBoost is an optimal solution in preference to reliance on a single classifier.

[1]  Shih-Wei Lin,et al.  Particle swarm optimization for parameter determination and feature selection of support vector machines , 2008, Expert Syst. Appl..

[2]  Manas Ranjan Patra,et al.  Hybrid intelligent systems for detecting network intrusions , 2015, Secur. Commun. Networks.

[3]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[4]  Alexandre Balon-Perin,et al.  Ensemble-based methods for intrusion detection , 2012 .

[5]  Salvatore J. Stolfo,et al.  A data mining framework for building intrusion detection models , 1999, Proceedings of the 1999 IEEE Symposium on Security and Privacy (Cat. No.99CB36344).

[6]  Chou-Yuan Lee,et al.  Efficiently solving general weapon-target assignment problem by genetic algorithms with greedy eugenics , 2003, IEEE Trans. Syst. Man Cybern. Part B.

[7]  Xiangji Huang,et al.  Mining network data for intrusion detection through combining SVMs with ant colony networks , 2014, Future Gener. Comput. Syst..

[8]  Sam Kwong,et al.  Genetic-fuzzy rule mining approach and evaluation of feature selection techniques for anomaly intrusion detection , 2007, Pattern Recognition.

[9]  R. Polikar,et al.  Ensemble based systems in decision making , 2006, IEEE Circuits and Systems Magazine.

[10]  Jian Ma,et al.  A new approach to intrusion detection using Artificial Neural Networks and fuzzy clustering , 2010, Expert Syst. Appl..

[11]  Andrew H. Sung,et al.  Intrusion detection using an ensemble of intelligent paradigms , 2005, J. Netw. Comput. Appl..

[12]  S. Nash,et al.  Linear and Nonlinear Programming , 1987 .

[13]  George Forman,et al.  An Extensive Empirical Study of Feature Selection Metrics for Text Classification , 2003, J. Mach. Learn. Res..

[14]  Carlos Martín-Vide,et al.  Evolutionary Design of Intrusion Detection Programs , 2007, Int. J. Netw. Secur..

[15]  Shuo-Yan Chou,et al.  A simulated-annealing-based approach for simultaneous parameter optimization and feature selection of back-propagation networks , 2008, Expert Syst. Appl..

[16]  P. S. Avadhani,et al.  Genetic Algorithm based Weight Extraction Algorithm for Artificial Neural Network Classifier in Intrusion Detection , 2012 .

[17]  Zied Elouedi,et al.  Naive Bayes vs decision trees in intrusion detection systems , 2004, SAC '04.

[18]  Cherukuri Aswani Kumar,et al.  Improving Accuracy of Intrusion Detection Model Using PCA and optimized SVM , 2016, J. Comput. Inf. Technol..

[19]  Ludmila I. Kuncheva,et al.  Measures of Diversity in Classifier Ensembles and Their Relationship with the Ensemble Accuracy , 2003, Machine Learning.

[20]  Yu-Lin He,et al.  Fuzzy nonlinear regression analysis using a random weight network , 2016, Inf. Sci..

[21]  Xin Yao,et al.  Evolving hybrid ensembles of learning machines for better generalisation , 2006, Neurocomputing.

[22]  Yu-Lin He,et al.  Random weight network-based fuzzy nonlinear regression for trapezoidal fuzzy number data , 2017, Appl. Soft Comput..

[23]  Ravi Jain,et al.  D-SCIDS: Distributed soft computing intrusion detection system , 2007, J. Netw. Comput. Appl..

[24]  Cheng Xiang,et al.  Design of Multiple-Level Hybrid Classifier for Intrusion Detection System , 2005, 2005 IEEE Workshop on Machine Learning for Signal Processing.

[25]  Li Guo,et al.  An active learning based TCM-KNN algorithm for supervised network intrusion detection , 2007, Comput. Secur..

[26]  Ajith Abraham,et al.  Modeling intrusion detection system using hybrid intelligent systems , 2007, J. Netw. Comput. Appl..

[27]  Ajith Abraham,et al.  Feature deduction and ensemble design of intrusion detection systems , 2005, Comput. Secur..

[28]  R. Schapire The Strength of Weak Learnability , 1990, Machine Learning.

[29]  Hussein A. Abbass,et al.  An adaptive genetic-based signature learning system for intrusion detection , 2009, Expert Syst. Appl..

[30]  Bhavani M. Thuraisingham,et al.  A new intrusion detection system using support vector machines and hierarchical clustering , 2007, The VLDB Journal.

[31]  I. Sumaiya Thaseen,et al.  Intrusion detection model using fusion of PCA and optimized SVM , 2014, 2014 International Conference on Contemporary Computing and Informatics (IC3I).

[32]  Siyang Zhang,et al.  A novel hybrid KPCA and SVM with GA model for intrusion detection , 2014, Appl. Soft Comput..