Scalable and Obfuscation-Resilient Android App Repackaging Detection Based on Behavior Birthmark

Repackaged Android apps are the major source of Android malware, which not only compromise the pecuniary profit of original authors, but also pose threat to security and privacy of mobile users. Although a large number of birthmark based approaches have been proposed for Android repackaging detection, the majority of them heavily rely on the code instruction details, thus suffering from the following two limitations: (1) subject to code/resource obfuscation technologies; (2) fail to large scale repackaging detection. In this paper, we propose a novel behavior based approach for Android repackaging detection to meet scalability and obfuscation-resilience at the same time. As the repackaged app always keeps the basic functionalities of the original one for leveraging its popularity, they usually have similar behaviors. This observation inspires us to design the new behavior based birthmark for Android repackaging detection, namely, API dependency graph. To further improve the detection performance, we also introduce a system dependency summary graph based ADG extraction approach for high efficiency birthmark construction. We implement a prototype system named ACFinder and evaluate our system using 13,917 apps of 22 categories collected from APK-DL. Experiments show that ACFinder can extract behavior birthmark efficiently (average 52.9s per app), and that our behavior birthmark is resilient to complex code obfuscation technologies (average app similarity all are 1.0 for 11 code obfuscation algorithms) and capable to large scale detection (average 0.37s per app pair).

[1]  Patrick Cousot,et al.  Andromeda: Accurate and Scalable Security Analysis of Web Applications , 2013, FASE.

[2]  Mourad Debbabi,et al.  Fingerprinting Android packaging: Generating DNAs for malware detection , 2016, Digit. Investig..

[3]  Thomas W. Reps,et al.  Precise interprocedural dataflow analysis via graph reachability , 1995, POPL '95.

[4]  Mu Zhang,et al.  Semantics-Aware Android Malware Classification Using Weighted Contextual API Dependency Graphs , 2014, CCS.

[5]  John C. S. Lui,et al.  DroidEagle: seamless detection of visually similar Android apps , 2015, WISEC.

[6]  Xiapu Luo,et al.  RootGuard: Protecting Rooted Android Phones , 2014, Computer.

[7]  Wenke Lee,et al.  CHEX: statically vetting Android apps for component hijacking vulnerabilities , 2012, CCS.

[8]  Xuxian Jiang,et al.  DroidChameleon: evaluating Android anti-malware against transformation attacks , 2013, ASIA CCS '13.

[9]  Mario Vento,et al.  A (sub)graph isomorphism algorithm for matching large graphs , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Joe D. Warren,et al.  The program dependence graph and its use in optimization , 1984, TOPL.

[11]  Steve Hanna,et al.  Juxtapp: A Scalable System for Detecting Code Reuse among Android Applications , 2012, DIMVA.

[12]  Yajin Zhou,et al.  Detecting repackaged smartphone applications in third-party android marketplaces , 2012, CODASPY '12.

[13]  Sencun Zhu,et al.  A Framework for Evaluating Mobile App Repackaging Detection Algorithms , 2013, TRUST.

[14]  Yang Liu,et al.  Semantic modelling of Android malware for effective malware comprehension, detection, and classification , 2016, ISSTA.

[15]  David W. Binkley,et al.  Interprocedural slicing using dependence graphs , 1988, SIGP.

[16]  Hao Chen,et al.  Attack of the Clones: Detecting Cloned Applications on Android Markets , 2012, ESORICS.

[17]  Alessandra Gorla,et al.  Mining Apps for Abnormal Usage of Sensitive Data , 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.

[18]  Yajin Zhou,et al.  Fast, scalable detection of "Piggybacked" mobile applications , 2013, CODASPY.

[19]  Hui Zang,et al.  AdRob: examining the landscape and impact of android application plagiarism , 2013, MobiSys '13.

[20]  Peng Liu,et al.  Achieving accuracy and scalability simultaneously in detecting application clones on Android markets , 2014, ICSE.

[21]  Fingerprinting Android packaging: Generating DNAs for malware detection , 2016 .

[22]  Jacques Klein,et al.  An Investigation into the Use of Common Libraries in Android Apps , 2015, 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER).

[23]  Kilian Q. Weinberger,et al.  Feature hashing for large scale multitask learning , 2009, ICML '09.

[24]  Haoyu Wang,et al.  WuKong: a scalable and accurate two-phase approach to Android app clone detection , 2015, ISSTA.

[25]  Jacques Klein,et al.  FlowDroid: precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for Android apps , 2014, PLDI.

[26]  Olga Gadyatskaya,et al.  FSquaDRA: Fast Detection of Repackaged Applications , 2014, DBSec.

[27]  Abhinav Srivastava,et al.  Detecting plagiarized mobile apps using API birthmarks , 2015, Automated Software Engineering.

[28]  Sencun Zhu,et al.  ViewDroid: towards obfuscation-resilient mobile application repackaging detection , 2014, WiSec '14.

[29]  David Brumley,et al.  BitShred: feature hashing malware for scalable triage and semantic analysis , 2011, CCS '11.

[30]  Lei Zhang,et al.  Towards a scalable resource-driven approach for detecting repackaged Android applications , 2014, ACSAC.

[31]  Sencun Zhu,et al.  Behavior based software theft detection , 2009, CCS.

[32]  Hao Chen,et al.  AnDarwin: Scalable Detection of Android Application Clones Based on Semantics , 2015, IEEE Transactions on Mobile Computing.

[33]  Lipo Wang,et al.  Detecting Clones in Android Applications through Analyzing User Interfaces , 2015, 2015 IEEE 23rd International Conference on Program Comprehension.