Sequential Message Characterization for Early Classification of Encrypted Internet Traffic

Classifying Internet traffic is critical to many network management tasks, including malicious attack detection, usage monitoring, and load balancing. Because traffic today is often encrypted, carried on randomized port numbers, and subject to highly dynamic network conditions, traditional approaches such as port mapping, deep packet inspection, and statistical analysis are no longer effective. In this paper, we first collect extensive traffic flows at the exit router of a university and label each flow with its source application. After extracting the message sequence of every collected flow, where a message consists of multiple consecutive TCP packets, we find that each application type has distinct sequential message features. Leveraging these features, we develop a system named SMC (Sequential Message Characterization), which performs online traffic classification using only the sequence of sizes of a few message segments. In SMC, after confirming the long-term dependency among message segments, we build a Long Short-Term Memory (LSTM) neural network to learn from the message size sequence, and then build a multi-classifier that classifies traffic types based on the probability profiles output by the deep LSTM models. Extensive experiments demonstrate that SMC achieves 97% classification accuracy on average. Moreover, with as few as 6 message sizes as input, SMC enables early online traffic classification, especially for heavy-traffic flows, which have a median of over 35 message segments.
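To make the input representation concrete, the following is a minimal sketch of how a flow's packets might be reduced to a short message size sequence. It is not the authors' implementation. It assumes that a message is a maximal run of consecutive payload-carrying TCP packets travelling in the same direction, and that direction is encoded as the sign of the size; the abstract fixes neither detail. The names `message_sizes`, `early_input`, and the `(direction, payload_bytes)` packet record are hypothetical.

```python
from typing import Iterable, List, Tuple

# Hypothetical packet record: (direction, payload_bytes), where direction is
# +1 for client-to-server and -1 for server-to-client.
Packet = Tuple[int, int]


def message_sizes(packets: Iterable[Packet]) -> List[int]:
    """Merge consecutive same-direction packets into signed message sizes.

    ASSUMPTION: a "message" is a maximal run of consecutive payload-carrying
    packets in one direction; the paper may define it differently.
    """
    sizes: List[int] = []
    current_dir = 0
    current_bytes = 0
    for direction, payload in packets:
        if payload <= 0:
            continue  # skip pure ACKs and other empty segments
        if direction != current_dir and current_bytes > 0:
            # Direction changed: close the message accumulated so far.
            sizes.append(current_dir * current_bytes)
            current_bytes = 0
        current_dir = direction
        current_bytes += payload
    if current_bytes > 0:
        sizes.append(current_dir * current_bytes)
    return sizes


def early_input(sizes: List[int], k: int = 6) -> List[int]:
    """Keep the first k message sizes, zero-padding flows shorter than k."""
    head = sizes[:k]
    return head + [0] * (k - len(head))
```

A fixed-length vector such as `early_input(message_sizes(packets))` is the kind of input a sequence model like an LSTM could consume after only the first few messages of a flow; the default `k = 6` mirrors the "as few as 6" figure quoted above.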

[1]  Pere Barlet-Ros,et al.  Independent comparison of popular DPI tools for traffic classification , 2015, Comput. Networks.

[2]  Jürgen Schmidhuber,et al.  Framewise phoneme classification with bidirectional LSTM and other neural network architectures , 2005, Neural Networks.

[3]  Yoshua Bengio,et al.  Drawing and Recognizing Chinese Characters with Recurrent Neural Network , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Adi Suryaputra Paramita Improving K-NN Internet Traffic Classification Using Clustering and Principle Component Analysis , 2017 .

[5]  Andrew W. Moore,et al.  Bayesian Neural Networks for Internet Traffic Classification , 2007, IEEE Transactions on Neural Networks.

[6]  Michalis Faloutsos,et al.  BLINC: multilevel traffic classification in the dark , 2005, SIGCOMM '05.

[7]  Minglu Li,et al.  LeaD: Large-Scale Edge Cache Deployment Based on Spatio-Temporal WiFi Traffic Statistics , 2021, IEEE Transactions on Mobile Computing.

[8]  Guangtao Xue,et al.  Noff: A Novel Extendible Parallel Library for High-Performance Network Traffic Monitoring , 2017, 2017 24th Asia-Pacific Software Engineering Conference (APSEC).

[9]  Jun Zhang,et al.  Internet Traffic Classification Using Constrained Clustering , 2014, IEEE Transactions on Parallel and Distributed Systems.

[10]  Vijay Sivaraman,et al.  Classifying IoT Devices in Smart Environments Using Network Traffic Characteristics , 2019, IEEE Transactions on Mobile Computing.

[11]  Bo Yang,et al.  Traffic classification using probabilistic neural networks , 2010, 2010 Sixth International Conference on Natural Computation.

[12]  Yan Luo,et al.  Efficient memory utilization on network processors for deep packet inspection , 2006, 2006 Symposium on Architecture For Networking And Communications Systems.

[13]  Oliver Spatscheck,et al.  Accurate, scalable in-network identification of p2p traffic using application signatures , 2004, WWW '04.

[14]  Dawei Wang,et al.  Characterizing Application Behaviors for classifying P2P traffic , 2014, 2014 International Conference on Computing, Networking and Communications (ICNC).

[15]  Luca Salgarelli,et al.  Support Vector Machines for TCP traffic classification , 2009, Comput. Networks.

[16]  Ju Ren,et al.  Serving at the Edge: A Scalable IoT Architecture Based on Transparent Computing , 2017, IEEE Network.

[17]  Scott E. Coull,et al.  Traffic Analysis of Encrypted Messaging Services: Apple iMessage and Beyond , 2014, CCRV.

[18]  Jun Qin,et al.  POST: Exploiting Dynamic Sociality for Mobile Advertising in Vehicular Networks , 2014, IEEE Transactions on Parallel and Distributed Systems.

[19]  Wenchao Xu,et al.  Big Data Driven Vehicular Networks , 2018, IEEE Network.

[20]  Feng Lyu,et al.  Edge Coordinated Query Configuration for Low-Latency and Accurate Video Analytics , 2020, IEEE Transactions on Industrial Informatics.

[21]  Dario Rossi,et al.  KISS: Stochastic Packet Inspection Classifier for UDP Traffic , 2010, IEEE/ACM Transactions on Networking.

[22]  Anirban Mahanti,et al.  Traffic classification using clustering algorithms , 2006, MineNet '06.

[23]  Ning Zhang,et al.  Content Popularity Prediction Towards Location-Aware Mobile Edge Caching , 2018, IEEE Transactions on Multimedia.

[24]  Xiaodong Lin,et al.  Anonymous Reputation System for IIoT-Enabled Retail Marketing Atop PoS Blockchain , 2019, IEEE Transactions on Industrial Informatics.

[25]  Dario Rossi,et al.  Reviewing Traffic Classification , 2013, Data Traffic Monitoring and Analysis.

[26]  Ju Ren,et al.  A Survey on End-Edge-Cloud Orchestrated Network Computing Paradigms , 2019, ACM Comput. Surv..

[27]  Jr. G. Forney,et al.  Viterbi Algorithm , 1973, Encyclopedia of Machine Learning.

[28]  Maurizio Longo,et al.  Revealing encrypted WebRTC traffic via machine learning tools , 2015, 2015 12th International Joint Conference on e-Business and Telecommunications (ICETE).

[29]  Li Wei,et al.  Network Traffic Classification Using K-means Clustering , 2007 .

[30]  Nadra Guizani,et al.  The Best of Both Worlds: A General Architecture for Data Management in Blockchain-enabled Internet-of-Things , 2020, IEEE Network.

[31]  Carey L. Williamson,et al.  Offline/realtime traffic classification using semi-supervised learning , 2007, Perform. Evaluation.

[32]  Nei Kato,et al.  A Mobility Analytical Framework for Big Mobile Data in Densely Populated Area , 2017, IEEE Transactions on Vehicular Technology.

[33]  Fulvio Risso,et al.  Lightweight, Payload-Based Traffic Classification: An Experimental Evaluation , 2008, 2008 IEEE International Conference on Communications.

[34]  Wenchao Xu,et al.  Internet of vehicles in big data era , 2018, IEEE/CAA Journal of Automatica Sinica.

[35]  Maurizio Dusi,et al.  Tunnel Hunter: Detecting application-layer tunnels with statistical fingerprinting , 2009, Comput. Networks.

[36]  Nei Kato,et al.  Construction of a Flexibility Analysis Model for Flexible High-Throughput Satellite Communication Systems With a Digital Channelizer , 2018, IEEE Transactions on Vehicular Technology.