Mobile Encrypted Traffic Classification Using Deep Learning: Experimental Evaluation, Lessons Learned, and Challenges

The massive adoption of hand-held devices has led to an explosion of mobile traffic volumes traversing home and enterprise networks, as well as the Internet. Traffic classification (TC), i.e., the set of procedures for inferring the (mobile) applications that generate such traffic, has become the enabler of highly valuable profiling information (with certain privacy downsides), besides being the workhorse for service differentiation/blocking. Nonetheless, the design of accurate classifiers is hampered by the rising adoption of encrypted protocols (such as TLS), which hinders the suitability of (otherwise effective) deep packet inspection approaches. Also, the fast-expanding set of apps and the moving-target nature of mobile traffic make design solutions based on usual machine learning, which rely on manually extracted, expert-originated features, outdated and unable to keep pace. For these reasons, deep learning (DL) is here proposed, for the first time, as a viable strategy to design practical mobile traffic classifiers based on automatically extracted features, able to cope with encrypted traffic and to reflect its complex patterns. To this end, different state-of-the-art DL techniques from (standard) TC are here reproduced, dissected (highlighting critical choices), and set into a systematic framework for comparison, which also includes a performance evaluation workbench. The latter outcome, although framed in the mobile context, is also applicable to the wider umbrella of encrypted TC tasks. Finally, the performance of these DL classifiers is critically investigated through an exhaustive experimental validation (based on three mobile datasets of real human users’ activity), highlighting the related pitfalls, design guidelines, and challenges.
