Early Identification of Services in HTTPS Traffic

Traffic monitoring is essential for network management tasks that ensure security and QoS. However, the continuous growth of HTTPS traffic undermines the effectiveness of current service-level monitoring, which must either rely on unreliable parameters from the TLS handshake (X.509 certificate, SNI) or decrypt the traffic. We propose a new machine learning-based method to identify HTTPS services without decryption. By extracting statistical features from the TLS handshake packets and from a small number of application data packets, we can identify HTTPS services very early in the session. Extensive experiments on a large, open dataset show that our method achieves good accuracy, and a prototype implementation confirms that HTTPS services can be identified early in the session.
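As a rough illustration of the idea described above, the following Python sketch computes simple size statistics over the handshake packets and the first few application data packets of one flow. It is not the authors' feature set: the function name `flow_features`, the parameter `n_app`, the choice of statistics, and the sample packet sizes are all assumptions made for this example. In practice such a feature vector would be passed to a trained classifier.

```python
from statistics import mean, pstdev


def flow_features(handshake_sizes, app_data_sizes, n_app=5):
    """Summarise packet sizes (in bytes) from the start of one TLS flow.

    Hypothetical helper: the statistics chosen here are illustrative only.
    Only the first `n_app` application data packets are used, so the
    features are available early in the session.
    """
    early_app = app_data_sizes[:n_app]
    features = {}
    for name, sizes in (("handshake", handshake_sizes), ("app", early_app)):
        features[f"{name}_count"] = len(sizes)
        features[f"{name}_min"] = min(sizes, default=0)
        features[f"{name}_max"] = max(sizes, default=0)
        features[f"{name}_mean"] = mean(sizes) if sizes else 0.0
        features[f"{name}_std"] = pstdev(sizes) if sizes else 0.0
    return features


# Made-up packet sizes for a single flow (assumption, not measured data).
example = flow_features(
    handshake_sizes=[517, 1460, 1200, 126],
    app_data_sizes=[300, 1460, 1460, 90, 640, 1460, 1460],
)
```

Limiting the application data to a small prefix is what keeps the identification early: the features can be computed without waiting for the flow to end.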
