Semantics-Based Repackaging Detection for Mobile Apps

While Android app stores keep growing in size and in number, app repackaging has become a major threat to the health of the mobile ecosystem. Different from many syntax-based repackaging detection techniques, in this work we propose a semantic-based approach, RepDetector, which is more robust against code obfuscation attacks. To capture an app's semantics, our approach extracts input-output states of core functions in the app and then compare function and app similarity. We implement a prototype of RepDetector, and evaluate it against various obfuscation technologies. The results show that our approach can detect repackaged apps effectively. It is also at least a hundred times faster than Androguard.

[1]  Christian S. Collberg,et al.  Sandmark--A Tool for Software Protection Research , 2003, IEEE Secur. Priv..

[2]  Sencun Zhu,et al.  A Framework for Evaluating Mobile App Repackaging Detection Algorithms , 2013, TRUST.

[3]  Hao Chen,et al.  Attack of the Clones: Detecting Cloned Applications on Android Markets , 2012, ESORICS.

[4]  Fangfang Zhang,et al.  A first step towards algorithm plagiarism detection , 2012, ISSTA 2012.

[5]  Yajin Zhou,et al.  Fast, scalable detection of "Piggybacked" mobile applications , 2013, CODASPY.

[6]  Karl Trygve Kalleberg,et al.  Finding software license violations through binary code clone detection , 2011, MSR '11.

[7]  Peng Liu,et al.  Achieving accuracy and scalability simultaneously in detecting application clones on Android markets , 2014, ICSE.

[8]  Daniel Shawcross Wilkerson,et al.  Winnowing: local algorithms for document fingerprinting , 2003, SIGMOD '03.

[9]  Steve Hanna,et al.  Juxtapp: A Scalable System for Detecting Code Reuse among Android Applications , 2012, DIMVA.

[10]  Yajin Zhou,et al.  Detecting repackaged smartphone applications in third-party android marketplaces , 2012, CODASPY '12.

[11]  J. Crussell,et al.  Scalable Semantics-Based Detection of Similar Android Applications , 2013 .

[12]  Xiangyu Zhang,et al.  Plagiarizing Smartphone Applications: Attack Strategies and Defense Techniques , 2012, ESSoS.

[13]  Cesare Tinelli,et al.  Leveraging linear and mixed integer programming for SMT , 2014, 2014 Formal Methods in Computer-Aided Design (FMCAD).

[14]  Christian S. Collberg,et al.  K-gram based software birthmarks , 2005, SAC '05.

[15]  Guofei Gu,et al.  SmartDroid: an automatic system for revealing UI-based trigger conditions in android applications , 2012, SPSM '12.

[16]  Hyun-il Lim,et al.  Detecting Theft of Java Applications via a Static Birthmark Based on Weighted Stack Patterns , 2008, IEICE Trans. Inf. Syst..

[17]  Sencun Zhu,et al.  ViewDroid: towards obfuscation-resilient mobile application repackaging detection , 2014, WiSec '14.

[18]  Tom Fawcett,et al.  An introduction to ROC analysis , 2006, Pattern Recognit. Lett..

[19]  Ryan Stevens,et al.  Characterizing Android Application Plagiarism and its Impact on Developers. , 2012 .

[20]  Sencun Zhu,et al.  Detecting Software Theft via System Call Based Birthmarks , 2009, 2009 Annual Computer Security Applications Conference.

[21]  Peng Wang,et al.  Finding Unknown Malice in 10 Seconds: Mass Vetting for New Threats at the Google-Play Scale , 2015, USENIX Security Symposium.

[22]  Lei Zhang,et al.  Towards a scalable resource-driven approach for detecting repackaged Android applications , 2014, ACSAC.

[23]  G. Rothermel International Symposium on Software Testing and Analysis , 2013 .

[24]  Sencun Zhu,et al.  Behavior based software theft detection , 2009, CCS.

[25]  Hao Chen,et al.  AnDarwin: Scalable Detection of Semantically Similar Android Applications , 2013, ESORICS.

[26]  D. Massart,et al.  The Mahalanobis distance , 2000 .

[27]  Sencun Zhu,et al.  Value-based program characterization and its application to software plagiarism detection , 2011, 2011 33rd International Conference on Software Engineering (ICSE).