A novel region adaptive SMOTE algorithm for intrusion detection on imbalanced problem

Machine learning techniques play a crucial part in intrusion detection and have substantially changed traditional intrusion detection methods. An important question is how to use them to achieve better detection results. However, because of limitations in the learning algorithms and the imbalance between attack behaviors and normal behaviors in network traffic, the detection rate of low-frequency attacks has been difficult to improve. To address this issue at the data level, a novel Region Adaptive Synthetic Minority Oversampling Technique (RA-SMOTE) is proposed. Three different types of classifiers, namely support vector machines (SVM), a BP neural network (BPNN), and random forests (RF), are used to test the effectiveness of the algorithm. Experimental results on the NSL-KDD dataset show that the proposed algorithm can effectively address the class imbalance problem and improve the detection rate of low-frequency attacks.
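The abstract does not describe the region-adaptive step, so the sketch below illustrates only the classic SMOTE interpolation that RA-SMOTE builds on, not the proposed algorithm itself. The function name `smote`, its parameters (`n_new`, `k`, `seed`), and the toy data are assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by classic SMOTE interpolation.

    Illustrative baseline only: this is NOT the RA-SMOTE method, whose
    region-adaptive rules are not given in the abstract.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class standing in for a low-frequency attack category.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_new=6)
```

Each synthetic sample lies on a line segment between two existing minority samples, so the minority class grows without exact duplication. The enlarged training set would then be passed to a classifier such as an SVM, a neural network, or a random forest.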

[18]  Mohammad Reza Parsaei,et al.  A Hybrid Data Mining Approach for Intrusion Detection on Imbalanced NSL-KDD Dataset , 2016 .