An in-depth experimental study of anomaly detection using gradient boosted machine

This paper improves the detection performance of anomaly-based intrusion detection systems (IDSs) using a gradient boosted machine (GBM). The best parameters of GBM are obtained by grid search. The performance of GBM is then compared with four well-known classifiers, i.e. random forest, deep neural network, support vector machine, and classification and regression tree, in terms of five performance measures, i.e. accuracy, specificity, sensitivity, false positive rate, and area under the receiver operating characteristic curve (AUC). The experimental results show that GBM significantly outperforms the most recent IDS techniques, i.e. fuzzy classifier, two-tier classifier, GAR-forest, and tree-based classifier ensemble. These results are the highest reported so far on the complete feature sets of three different datasets, i.e. NSL-KDD, UNSW-NB15, and the GPRS dataset, using either tenfold cross-validation or the hold-out method. Moreover, we support our results with two statistical significance tests, which have yet to be used in existing IDS research.
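For readers who want the performance measures named above made concrete, the following is a minimal sketch of how they are commonly computed for a binary detector. It is an illustration under stated assumptions, not the authors' evaluation code: the label encoding (1 = attack, 0 = normal) and the names `ConfusionCounts`, `confusion_counts`, `detection_metrics`, and `auc` are hypothetical, and the AUC is computed with the standard pairwise-ranking formulation rather than any particular library routine.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # attacks flagged as attacks
    tn: int  # normal traffic flagged as normal
    fp: int  # normal traffic flagged as attacks
    fn: int  # attacks flagged as normal


def confusion_counts(y_true, y_pred):
    """Count TP/TN/FP/FN, assuming labels 1 = attack and 0 = normal."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            fn += 1
    return ConfusionCounts(tp, tn, fp, fn)


def detection_metrics(c):
    """Accuracy, sensitivity, specificity, and false positive rate.

    Assumes both classes occur in the data, so no denominator is zero.
    """
    total = c.tp + c.tn + c.fp + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),          # true positive rate
        "specificity": c.tn / (c.tn + c.fp),          # true negative rate
        "false_positive_rate": c.fp / (c.fp + c.tn),  # 1 - specificity
    }


def auc(y_true, scores):
    """AUC as the fraction of (attack, normal) pairs ranked correctly.

    Ties between an attack score and a normal score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy labels and predictions, invented purely for illustration.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(detection_metrics(confusion_counts(y_true, y_pred)))
```

The pairwise AUC is quadratic in the number of samples. That is fine for a small illustration, but a sort-based computation would be preferable on datasets of realistic size.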
