LeakSemantic: Identifying abnormal sensitive network transmissions in mobile applications

Mobile applications (apps) often transmit sensitive data over the network for various purposes. Some transmissions are needed to fulfill the app's functionality. However, transmissions to malicious receivers may leak private data, and they tend to behave stealthily to evade detection. The problem is twofold: how does one unveil sensitive transmissions in mobile apps, and given a sensitive transmission, how does one determine whether it is legitimate? In this paper, we propose LeakSemantic, a framework that automatically locates abnormal sensitive network transmissions in mobile apps. LeakSemantic consists of a hybrid program analysis component and a machine learning component. The program analysis component combines static and dynamic analysis to precisely identify sensitive transmissions. Compared to existing taint analysis approaches, LeakSemantic achieves better accuracy with fewer false positives and can collect runtime data, such as network traffic, for each transmission. Based on features derived from the runtime data, machine learning classifiers are built to further differentiate between legitimate and illegitimate disclosures. Experiments show that LeakSemantic achieves 91% accuracy on 2279 sensitive connections from 1404 apps.
