Encrypted Traffic Identification Based on N-gram Entropy and Cumulative Sum Test

Because existing entropy-based methods characterize encrypted traffic poorly, this paper proposes an encrypted traffic identification method based on n-gram entropy and the cumulative sum test. The method analyzes the n-gram entropy characteristics of text, image, compressed-file, and encrypted traffic in the network. A cumulative sum analysis is then applied to better distinguish compressed-file traffic from encrypted traffic. Experiments show that the proposed method identifies encrypted traffic with high accuracy and performs well at distinguishing compressed-file traffic from encrypted traffic.
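
As an illustration of the two quantities named above, the following is a minimal sketch and not the paper's implementation. It assumes that n-gram entropy means Shannon entropy over overlapping byte n-grams of a payload, and that the cumulative sum statistic follows the forward-mode cumulative sums test of the NIST statistical test suite (maximum absolute partial sum of bits mapped to +1/-1). The function names `ngram_entropy` and `cusum_statistic`, and the normalization by the square root of the bit count, are choices made for this example.

```python
import math
from collections import Counter


def ngram_entropy(data: bytes, n: int = 1) -> float:
    """Shannon entropy, in bits per n-gram, over overlapping byte n-grams.

    Assumption: the n-grams are byte-aligned and overlapping; the paper
    may define them differently.
    """
    total = len(data) - n + 1
    if total <= 0:
        return 0.0
    counts = Counter(data[i:i + n] for i in range(total))
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def cusum_statistic(data: bytes) -> float:
    """Maximum absolute partial sum of the bit sequence, with bits mapped
    to +1 (bit set) and -1 (bit clear), divided by sqrt(number of bits).

    Assumption: forward-mode cumulative sums; small values indicate a
    sequence whose partial sums stay close to zero, as expected of
    random-looking data.
    """
    partial = 0
    max_excursion = 0
    n_bits = 0
    for byte in data:
        for shift in range(7, -1, -1):
            partial += 1 if (byte >> shift) & 1 else -1
            max_excursion = max(max_excursion, abs(partial))
            n_bits += 1
    return max_excursion / math.sqrt(n_bits) if n_bits else 0.0
```

Under these assumptions, a payload with a single repeated byte has zero 1-gram entropy, while a payload in which all 256 byte values occur equally often reaches the maximum of 8 bits. The cumulative sum statistic is intended to capture bit-level bias that byte-level entropy alone may miss, which is the role the abstract assigns to it in separating compressed-file traffic from encrypted traffic.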
