A survey of methods for encrypted traffic classification and analysis

With the widespread use of encrypted data transport, network traffic encryption is becoming standard. This presents a challenge for traffic measurement, especially for analysis and anomaly detection methods, which depend on the type of network traffic. In this paper, we survey existing approaches for classification and analysis of encrypted traffic. First, we describe the most widespread encryption protocols used throughout the Internet. We show that the initiation of an encrypted connection and the protocol structure reveal much information that can be used for encrypted traffic classification and analysis. Then, we survey payload-based and feature-based classification methods for encrypted traffic and categorize them using an established taxonomy. An advantage of some of the described classification methods is the ability to recognize the encrypted application protocol in addition to the encryption protocol. Finally, we make a comprehensive comparison of the surveyed feature-based classification methods and present their weaknesses and strengths. Copyright © 2015 John Wiley & Sons, Ltd.
