Leveraging Inner-Connection of Message Sequence for Traffic Classification: A Deep Learning Approach

Classifying traffic flows by source application is of great value for intelligent network management: it helps to detect malicious attacks, monitor the network, optimize network behavior, and thereby improve user experience. However, achieving high-accuracy traffic classification, especially in real time, is very challenging because traffic flows behave in complicated ways: network applications often transmit encrypted traffic on randomized port numbers under highly dynamic network conditions. In this paper, by collecting extensive application traffic flows at the exit router of Shanghai Maritime University (where the traffic rate can reach up to 7 GB/s at peak time), we identify a very distinct characteristic in the inner-connection of the message sequence of different application flows, where a message is a group of one or more consecutive TCP packets. We then propose a traffic classification algorithm that trains a Long Short-Term Memory (LSTM) neural network on the message sequence vector of a traffic flow (not necessarily covering all of its messages) and uses the resulting classifier for online traffic flow classification. Extensive simulations are conducted with varied training data sizes and diverse source applications, and an average per-flow classification accuracy of about 97% is achieved.
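
As an illustration of what a message sequence vector might look like, the following minimal sketch groups consecutive TCP packets into messages and produces a fixed-length vector that could be fed to a sequence model such as an LSTM. The abstract does not specify the grouping rule or the features, so everything here is an assumption: packets are merged when they travel in the same direction, each message is represented by its signed total payload size, and the names `group_messages` and `to_fixed_vector` are hypothetical.

```python
from typing import List, Tuple

# A packet is (direction, payload_bytes). The encoding is an assumption:
# +1 means client-to-server, -1 means server-to-client.
Packet = Tuple[int, int]


def group_messages(packets: List[Packet]) -> List[int]:
    """Merge runs of consecutive same-direction packets into messages.

    Returns signed message sizes: the sign carries the direction and the
    magnitude is the total payload of the run. This grouping rule is an
    assumption, not necessarily the one used in the paper.
    """
    messages: List[int] = []
    for direction, size in packets:
        if size == 0:
            continue  # skip packets without payload, such as pure ACKs
        if messages and (messages[-1] > 0) == (direction > 0):
            messages[-1] += direction * size  # same direction: extend the message
        else:
            messages.append(direction * size)  # direction changed: new message
    return messages


def to_fixed_vector(messages: List[int], length: int) -> List[int]:
    """Keep only the first `length` messages, zero-padding short flows."""
    return (messages + [0] * length)[:length]


# Hypothetical flow: one request, a three-packet response, then a short exchange.
flow = [(1, 517), (-1, 1460), (-1, 1460), (-1, 320), (1, 0), (1, 126), (-1, 51)]
messages = group_messages(flow)
vector = to_fixed_vector(messages, 6)
print(messages)
print(vector)
```

Truncating to the first few messages reflects the abstract's remark that the vector need not cover all messages of a flow, which is what would allow a flow to be classified online before it ends.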
