Know your Big Data Trade-offs when Classifying Encrypted Mobile Traffic with Deep Learning

The spread of handheld devices has led to unprecedented growth in the traffic volumes traversing both local networks and the Internet, making mobile traffic classification a key tool for gathering highly valuable profiling information, in addition to supporting traffic engineering and service management. However, the nature of mobile traffic severely challenges state-of-the-art Machine Learning (ML) approaches: the set of apps generating traffic evolves and expands quickly, which hinders ML-based classifiers, since these require domain-expert design. Deep Learning (DL) represents a promising solution to this issue, but results in longer completion times, which in turn suggests applying the Big Data (BD) paradigm. In this paper, we investigate for the first time BD-enabled classification of encrypted mobile traffic using DL from a general standpoint, (a) defining general design guidelines, (b) leveraging a public-cloud platform, and (c) resorting to a realistic experimental setup. We find that, while BD acts as a transparent accelerator for some tasks, this is not the case for the training phase of DL architectures for traffic classification, which requires a specific BD-informed design. The experimental investigation proceeds along three dimensions of BD adoption, namely (i) completion time, (ii) deployment costs, and (iii) classification performance, and highlights relevant non-trivial trade-offs.
