Extreme trees network intrusion detection framework based on ensemble learning

Basic models perform unstably across different network datasets, while ensemble models improve accuracy at the cost of greatly increased training and testing time. To address both problems, this paper proposes a new intrusion detection model based on improved extreme trees. First, feature selection is used to improve computing efficiency. Then, to improve accuracy and adaptability, bagging is used to improve the extreme trees model, and the improved extreme trees and Quadratic Discriminant Analysis are integrated in a maximization manner to obtain the learning results. Experiments on the KDDCUP99 and UNSW-NB15 datasets verify that the new model trains and tests much faster than the GBDT model while achieving higher accuracy. On the UNSW-NB15 dataset, the training and testing time of GBDT is 3.68 times that of the new model, and the accuracy of the new model is 2.27% higher than that of GBDT. In addition, the Fuzzers and Shellcode attacks in the UNSW-NB15 dataset were extracted and tested separately, which verified that the new model adapts well to the detection of various attack types. Finally, the new model is combined with a blacklist mechanism and detection rules and applied to anomaly detection systems.
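The abstract does not spell out the maximization rule, so the following is only a minimal sketch under one assumption: each of the two models outputs per-class probabilities, and the combined score for a class is the larger of the two. The function name `combine_by_maximization` and the probability values are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def combine_by_maximization(
    proba_a: Sequence[Sequence[float]],
    proba_b: Sequence[Sequence[float]],
) -> List[int]:
    """Combine two models' class probabilities by element-wise maximum.

    Assumption (not stated in the paper): for every sample, the score of a
    class is the larger of the two models' probabilities for that class,
    and the predicted label is the class with the highest combined score.
    """
    if len(proba_a) != len(proba_b):
        raise ValueError("both models must score the same number of samples")

    labels = []
    for row_a, row_b in zip(proba_a, proba_b):
        if len(row_a) != len(row_b):
            raise ValueError("both models must score the same classes")
        # Keep, for each class, the more confident of the two models.
        merged = [max(a, b) for a, b in zip(row_a, row_b)]
        # Pick the class index with the highest combined score
        # (ties resolve to the lowest class index).
        labels.append(max(range(len(merged)), key=merged.__getitem__))
    return labels


# Hypothetical outputs for three samples and two classes (0 = normal, 1 = attack).
trees_proba = [[0.90, 0.10], [0.55, 0.45], [0.20, 0.80]]  # bagged extreme trees
qda_proba = [[0.40, 0.60], [0.10, 0.90], [0.45, 0.55]]    # QDA

predictions = combine_by_maximization(trees_proba, qda_proba)
```

In the second sample the two models disagree, and the more confident QDA score decides the label. This is the behavior a maximization rule is usually chosen for, although the paper may define the rule differently.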
