SmoteAdaNL: a learning method for network traffic classification

Machine-learning-based network traffic classification is a critical technique for network management and has attracted much attention. Most recent work focuses on achieving high flow classification accuracy (FCA). However, because “mice” flows outnumber “elephant” flows in the Internet, such classifiers are better suited to “mice” flows and achieve low byte classification accuracy (BCA). To address this issue, the notion of byte misclassification is first explored. Based on the finding that most misclassified bytes belong to the minority class, a novel network traffic classification method is proposed that combines data re-sampling with ensemble learning. To improve classification accuracy on the minority class, a data re-sampling algorithm is employed to increase the number of minority-class flows. Re-sampling, however, changes the data distribution and degrades the generalization of a classifier. A boosting-style ensemble learning algorithm that takes ensemble diversity into account is therefore employed to improve generalization. Experiments conducted on real-world traffic datasets show that the proposed method achieves over 90 % BCA and 96 % FCA on average, and improves BCA by about 7.15 % compared with existing methods.
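
The gap between flow accuracy and byte accuracy can be illustrated with a minimal sketch. The function names, the class labels ("web", "bulk"), and the flow sizes below are hypothetical and chosen only for illustration; they are not taken from the paper's datasets or implementation. The sketch assumes FCA is the fraction of correctly classified flows and BCA is the fraction of bytes carried by correctly classified flows.

```python
from typing import Sequence


def flow_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of flows whose predicted class matches the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def byte_accuracy(
    y_true: Sequence[str], y_pred: Sequence[str], flow_bytes: Sequence[int]
) -> float:
    """Fraction of bytes that belong to correctly classified flows."""
    correct_bytes = sum(
        b for t, p, b in zip(y_true, y_pred, flow_bytes) if t == p
    )
    return correct_bytes / sum(flow_bytes)


# Hypothetical toy trace: nine small "mice" flows of 1,000 bytes each and
# one large "elephant" flow of 91,000 bytes from a minority class.
y_true = ["web"] * 9 + ["bulk"]
flow_bytes = [1_000] * 9 + [91_000]

# A classifier biased toward the majority class labels every flow "web",
# so only the single elephant flow is misclassified.
y_pred = ["web"] * 10

fca = flow_accuracy(y_true, y_pred)
bca = byte_accuracy(y_true, y_pred, flow_bytes)
print(f"FCA = {fca:.2f}, BCA = {bca:.2f}")  # prints "FCA = 0.90, BCA = 0.09"
```

In this toy case a single misclassified minority-class flow leaves FCA at 90 % while BCA drops to 9 %, which is the kind of imbalance that re-sampling the minority class is meant to counter.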
