Benchmark Data for Mobile App Traffic Research

Mobile app traffic classification aims to automatically map mobile packets to the apps that generated them. It has become an active topic in mobile traffic engineering, and numerous approaches have been proposed for it, including machine learning and deep packet inspection methods. However, existing works mainly evaluate their methods on their own privately collected mobile traffic traces. Because no public benchmark data exists, the results reported in different papers cannot be directly compared, which largely limits the development of mobile app traffic classification methods. This paper describes our Mobile Traffic Data (MTD): Android app traffic flow sample sets with ground truth. The goal of MTD is to advance the state of the art in mobile app traffic classification. To build MTD, we collected and annotated more than ten thousand traffic flows using the Mobilegt system. Commonly used flow features were also extracted to build flow samples for machine-learning-based mobile traffic classification. The MTD sets have been made publicly available. In addition, this paper analyzes the performance of typical machine learning techniques on MTD, which can serve as baseline results on this benchmark data.
