DroidCollector: A High Performance Framework for High Quality Android Traffic Collection

In this mobile era, people have become increasingly dependent on smart devices, and smartphones have emerged as the most popular smart computing device. However, numerous security issues affecting smartphones have been exposed. In recent years, approaches based on mobile network traffic have been proposed to identify the malicious behaviors of malware, but these approaches, especially those using machine learning methods, are largely constrained by the difficulty of collecting mobile traffic datasets. Without sufficient and effective mobile traffic datasets, research on mobile network traffic is hindered. This study introduces DroidCollector, a high-performance framework for high-quality Android traffic collection. The framework leverages multithreading to perform active and automatic network traffic collection. Using this framework, we collect 808 MB of traffic data generated by 6000 benign apps and 330 MB generated by 5560 malicious apps in a short period of time. The collected high-quality traffic is mostly generated by the apps themselves, with little irrelevant traffic. We also apply a machine learning algorithm to the extracted traffic features to identify malicious network behaviors. The experimental results show that it achieves a malicious traffic detection rate of 98% on average.
