Novel HTTPS classifier driven by packet bursts, flows, and machine learning

Encryption of network traffic has recently begun to cover the remaining readable information that current monitoring systems rely on heavily; it is therefore time to focus on novel methods of encrypted traffic analysis and classification. This paper defines a new network traffic characteristic called Sequence of packet Burst Length and Time (SBLT), inspired by existing approaches and definitions. Unlike other works, SBLT is feasible even on high-speed backbone networks, since it can be carried as part of IP flow data. The advantage of SBLT features is demonstrated with a machine learning model that classifies HTTPS traffic types. The paper presents the definition of SBLT, introduces a new annotated public dataset of HTTPS traffic with 5 categories, and evaluates the developed classifier, which reaches an accuracy of over 99 %. Such a classifier can help analysts cope with large volumes of encrypted traffic and maintain situational awareness.
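To give a rough idea of what a burst-based flow feature can look like, the following minimal sketch groups the packets of one flow into bursts and records each burst's length and duration. The abstract does not give the formal SBLT definition, so everything here is an illustrative assumption: a burst is taken to be a maximal run of consecutive packets in the same direction whose inter-packet gap stays below a threshold, and the names `Packet`, `burst_sequence`, and `max_gap` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Packet:
    timestamp: float  # arrival time in seconds
    size: int         # payload size in bytes
    direction: int    # +1 = client to server, -1 = server to client


def burst_sequence(packets: List[Packet],
                   max_gap: float = 1.0) -> List[Tuple[int, int, float]]:
    """Group a flow's packets into bursts.

    Assumed burst definition (not taken from the paper): consecutive
    packets in the same direction whose inter-packet gap is at most
    `max_gap` seconds. Returns one (direction, total_bytes, duration)
    tuple per burst, in order of occurrence.
    """
    bursts: List[Tuple[int, int, float]] = []
    prev = None      # last packet seen
    start = 0.0      # timestamp of the first packet in the current burst
    total = 0        # bytes accumulated in the current burst
    for p in packets:
        same_burst = (
            prev is not None
            and p.direction == prev.direction
            and p.timestamp - prev.timestamp <= max_gap
        )
        if same_burst:
            total += p.size
        else:
            # Close the previous burst (if any) and open a new one.
            if prev is not None:
                bursts.append((prev.direction, total, prev.timestamp - start))
            start = p.timestamp
            total = p.size
        prev = p
    if prev is not None:
        bursts.append((prev.direction, total, prev.timestamp - start))
    return bursts


# Hypothetical flow: a short request, a two-packet response, a late packet.
flow = [
    Packet(0.0, 100, +1),
    Packet(0.5, 200, +1),
    Packet(1.0, 1500, -1),
    Packet(1.5, 1500, -1),
    Packet(10.0, 50, -1),
]
sblt_like = burst_sequence(flow, max_gap=1.0)
```

A fixed-length prefix of such a sequence could be attached to a flow record and used as classifier input; the choice of threshold and the exact per-burst fields would follow the paper's definition, not this sketch.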
