SAMLDroid: A Static Taint Analysis and Machine Learning Combined High-Accuracy Method for Identifying Android Apps with Location Privacy Leakage Risks

Insecure applications (apps) are increasingly used to steal users’ location information for illegal purposes, which has aroused great concern in recent years. Although the existing methods, i.e., static and dynamic taint analysis, have shown great merit for identifying such apps, which mainly rely on statically analyzing source code or dynamically monitoring the location data flow, identification accuracy is still under research, since the analysis results contain a certain false positive or true negative rate. In order to improve the accuracy and reduce the misjudging rate in the process of vetting suspicious apps, this paper proposes SAMLDroid, a combined method of static code analysis and machine learning for identifying Android apps with location privacy leakage, which can effectively improve the identification rate compared with existing methods. SAMLDroid first uses static analysis to scrutinize source code to investigate apps with location acquiring intentions. Then it exploits a well-trained classifier and integrates an app’s multiple features to dynamically analyze the pattern and deliver the final verdict about the app’s property. Finally, it is proved by conducting experiments, that the accuracy rate of SAMLDroid is up to 98.4%, which is nearly 20% higher than Apparecium.

[1]  Dirk Westhoff,et al.  QuantDroid: Quantitative approach towards mitigating privilege escalation on Android , 2013, 2013 IEEE International Conference on Communications (ICC).

[2]  Sankardas Roy,et al.  Deep Ground Truth Analysis of Current Android Malware , 2017, DIMVA.

[3]  Jacques Klein,et al.  IccTA: Detecting Inter-Component Privacy Leaks in Android Apps , 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.

[4]  Kim-Kwang Raymond Choo,et al.  Achieving high performance and privacy-preserving query over encrypted multidimensional big metering data , 2018, Future Gener. Comput. Syst..

[5]  Di Wu,et al.  DeepFlow: Deep learning-based malware detection by mining Android application for abnormal usage of sensitive data , 2017, 2017 IEEE Symposium on Computers and Communications (ISCC).

[6]  Byung-Gon Chun,et al.  TaintDroid: An Information-Flow Tracking System for Realtime Privacy Monitoring on Smartphones , 2010, OSDI.

[7]  Jacques Klein,et al.  AndroZoo: Collecting Millions of Android Apps for the Research Community , 2016, 2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR).

[8]  Jeff H. Perkins,et al.  Information Flow Analysis of Android Applications in DroidSafe , 2015, NDSS.

[9]  Yuanfang Guo,et al.  Contextual approach for identifying malicious Inter-Component privacy leaks in Android apps , 2017, 2017 IEEE Symposium on Computers and Communications (ISCC).

[10]  Xiaodong Lin,et al.  Detecting GPS information leakage in Android applications , 2013, 2013 IEEE Global Communications Conference (GLOBECOM).

[11]  Eric Bodden,et al.  StubDroid: Automatic Inference of Precise Data-Flow Summaries for the Android Framework , 2016, 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE).

[12]  Dharma P. Agrawal,et al.  Handbook of Research on Modern Cryptographic Solutions for Computer and Cyber Security , 2016 .

[13]  Gianluca Dini,et al.  MADAM: Effective and Efficient Behavior-based Android Malware Detection and Prevention , 2018, IEEE Transactions on Dependable and Secure Computing.

[14]  Julian Schütte,et al.  Apparecium: Revealing Data Flows in Android Applications , 2015, 2015 IEEE 29th International Conference on Advanced Information Networking and Applications.

[15]  Sankardas Roy,et al.  Amandroid: A Precise and General Inter-component Data Flow Analysis Framework for Security Vetting of Android Apps , 2014, CCS.

[16]  Zhenkai Liang,et al.  Monet: A User-Oriented Behavior-Based Malware Variants Detection System for Android , 2016, IEEE Transactions on Information Forensics and Security.

[17]  Seungyeop Han,et al.  TaintDroid , 2010, OSDI.

[18]  Jacques Klein,et al.  FlowDroid: precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for Android apps , 2014, PLDI.

[19]  Romit Roy Choudhury,et al.  Hiding stars with fireworks: location privacy through camouflage , 2009, MobiCom '09.

[20]  Cyrus Shahabi,et al.  Blind Evaluation of Nearest Neighbor Queries Using Space Transformation to Preserve Location Privacy , 2007, SSTD.

[21]  Frank Stajano,et al.  Mix zones: user privacy in location-aware services , 2004, IEEE Annual Conference on Pervasive Computing and Communications Workshops, 2004. Proceedings of the Second.

[22]  Johannes Fürnkranz,et al.  Decision Tree , 2010, Encyclopedia of Machine Learning and Data Mining.

[23]  Latanya Sweeney,et al.  k-Anonymity: A Model for Protecting Privacy , 2002, Int. J. Uncertain. Fuzziness Knowl. Based Syst..