Encrypted Traffic Classification Based on Unsupervised Learning in Cellular Radio Access Networks

Traffic classification will be a key aspect of operating future 5G cellular networks, where services of very different natures will coexist. Unfortunately, data encryption makes this task very difficult. To overcome this issue, flow-based schemes have been proposed that use payload-independent features extracted from the Internet Protocol (IP) traffic flow. However, such an approach relies on expensive traffic probes in the core network. As an alternative, this paper presents an offline method for encrypted traffic classification in the radio interface. The method divides connections per service class by analyzing only features in radio connection traces collected by base stations. For this purpose, it relies on unsupervised learning, namely agglomerative hierarchical clustering. Thus, it can be applied in the absence of labeled data, which are seldom available in operational cellular networks. Likewise, it can identify new services launched in the network. The method is assessed on a real trace dataset taken from a live Long Term Evolution (LTE) network. Results show that the traffic shares per application class estimated by the proposed method are similar to those provided by a vendor report.
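As an illustration of the clustering step only, the following minimal sketch groups a handful of connections with agglomerative hierarchical clustering. It is not the implementation evaluated in the paper: the two features (log10 of downlink volume and connection duration), the toy values, the z-score normalization, the Ward merging criterion and the fixed number of clusters are all assumptions made for the example.

```python
import statistics
from itertools import combinations

def zscore(rows):
    """Standardize each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    stds = [statistics.pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    na, nb = len(a["members"]), len(b["members"])
    d2 = sum((x - y) ** 2 for x, y in zip(a["centroid"], b["centroid"]))
    return na * nb / (na + nb) * d2

def agglomerative(points, n_clusters):
    """Bottom-up clustering: start with singletons, merge the cheapest pair."""
    clusters = [{"members": [i], "centroid": list(p)} for i, p in enumerate(points)]
    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        a, b = clusters[i], clusters[j]
        na, nb = len(a["members"]), len(b["members"])
        merged = {
            "members": a["members"] + b["members"],
            "centroid": [(na * x + nb * y) / (na + nb)
                         for x, y in zip(a["centroid"], b["centroid"])],
        }
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    labels = [0] * len(points)
    for label, c in enumerate(clusters):
        for m in c["members"]:
            labels[m] = label
    return labels

# Hypothetical per-connection features: (log10 downlink bytes, duration in s).
connections = [
    (2.1, 1.0), (2.3, 1.5), (2.0, 0.8),      # small, short connections
    (6.5, 60.0), (6.8, 75.0), (6.6, 66.0),   # large, long connections
    (4.0, 10.0), (4.2, 12.0), (3.9, 9.0),    # intermediate connections
]
labels = agglomerative(zscore(connections), n_clusters=3)

# Share of connections assigned to each cluster (unlabeled groups).
shares = {c: labels.count(c) / len(labels) for c in set(labels)}
print(labels, shares)
```

The clusters produced this way carry no service names. Mapping each group to an application class still requires inspecting its feature profile, and the number of clusters would in practice be chosen with a cluster validity index rather than fixed in advance.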
