An accurate traffic classification model based on support vector machines

Network traffic classification is a fundamental research topic in high-performance network protocol design and network operation management. Compared with other state-of-the-art approaches to network traffic classification, machine learning (ML) methods are more flexible and intelligent: they can automatically search for and describe useful structural patterns in a supplied traffic dataset. As a typical ML method, support vector machines (SVMs), which are based on statistical theory, have high classification accuracy and stability. However, the performance of an SVM classifier can be severely affected by the data scale, the feature dimension, and the parameters of the classifier. In this paper, a real-time, accurate SVM training model named SPP-SVM is proposed. SPP-SVM is derived from the scaled dataset and employs principal component analysis (PCA) to extract data features and to verify the relevant traffic features obtained from PCA. By using PCA for dimension reduction, SPP-SVM identifies the critical component features, reduces the redundancy among them, and lowers the original feature dimension, which reduces overfitting and effectively improves generalization. The optimal working parameters of the kernel function used in SPP-SVM are derived automatically by an improved particle swarm optimization algorithm, which optimizes the global solution and makes its inertia weight coefficient adaptive, so that there is no need to search for the parameters over a wide range, traverse all the parameter points in a grid, and adjust the step size gradually. The performance of its two-class and multi-class classifiers is evaluated on 2 sets of traffic traces collected from different topological points on the Internet. Experiments show that SPP-SVM's two-class and multi-class classifiers are superior to typical supervised ML algorithms and perform significantly better than a traditional SVM in classification accuracy, dimension, and elapsed time.
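The parameter search described above can be illustrated with a minimal sketch. This is not the paper's improved particle swarm optimization, whose details the abstract does not give: it is a basic PSO in which the inertia weight simply decays linearly over the iterations, used here as one simple stand-in for an adaptive weight. The function names, the constants, and the toy objective (a smooth bowl standing in for the cross-validated error of an SVM over log2 C and log2 gamma) are all assumptions made for illustration.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 w_max=0.9, w_min=0.4, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` inside a box with a basic PSO.

    The inertia weight decays linearly from w_max to w_min, which is an
    assumed, simple form of adaptation (not the paper's scheme).
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests and the global best seen so far.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for t in range(n_iters):
        # Large weight early (exploration), small weight late (refinement).
        w = w_max - (w_max - w_min) * t / max(n_iters - 1, 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_validation_error(params):
    """Hypothetical stand-in for SVM validation error over
    (log2 C, log2 gamma); its minimum is placed at (3, -5)."""
    log_c, log_gamma = params
    return (log_c - 3.0) ** 2 + (log_gamma + 5.0) ** 2


search_box = [(-5.0, 15.0), (-15.0, 3.0)]  # assumed ranges, log2 scale
best, err = pso_minimize(toy_validation_error, search_box)
```

In a real setting the toy objective would be replaced by a function that trains an SVM with the candidate kernel parameters and returns its validation error, so each particle evaluation costs one training run instead of one visit per grid point.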
