Poster: Data Collection for ML Classification of Encrypted Messaging Applications

Network traffic classification identifies the nature of traffic on a network. Entities capable of monitoring network traffic use classification for many purposes, including identifying the mobile applications in use on the network. Such monitoring may also reveal that users on these networks are using encrypted messaging applications, compromising part of their privacy. In this paper, we describe a system that leverages campus network resources to generate real-world data alongside a more curated dataset captured from Android application traffic. We also explore the ability of machine learning (ML) models to accurately classify traffic from these encrypted messaging applications. Understanding what network data reveals is important given that these applications are used to maximize privacy in the first place.
