Orlis: Obfuscation-Resilient Library Detection for Android

Android apps often contain third-party libraries. Many program analyses need to identify the library code in a given closed-source Android app; clients of such library detection include security analysis, clone/repackage detection, and library removal/isolation. However, library detection is complicated significantly by the code obfuscation techniques commonly used for Android. Although some state-of-the-art library detection tools are designed to be resilient to obfuscation, there is still room to improve their recall, precision, and analysis cost. We propose a new approach to detect third-party libraries in obfuscated apps. The approach relies on obfuscation-resilient code features derived from the interprocedural structure and behavior of the app (e.g., call graphs of methods). Its design is informed by close examination of the code features preserved by typical Android obfuscators. To reduce analysis cost, we use similarity digests as an efficient mechanism for identifying a small number of likely matches. We implemented this approach in the Orlis library detection tool. As demonstrated by our experimental results, Orlis advances the state of the art and presents an attractive choice for detection of third-party libraries in Android apps.
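To make the idea of call-graph-based, obfuscation-resilient features concrete, the following is a minimal sketch and not the Orlis implementation. It assumes a toy representation in which each method has a "fuzzy" signature (framework types kept, app- or library-defined types replaced by a placeholder `X`) and a list of callees. A hashed feature set with a containment score stands in for the similarity digests mentioned above. All names (`Methods`, `feature_set`, `containment`, `likely_matches`) and the 0.8 threshold are hypothetical.

```python
import hashlib
from typing import Dict, List, Set, Tuple

# Hypothetical toy model: method name -> (fuzzy signature, callee names).
Methods = Dict[str, Tuple[str, List[str]]]


def reachable(name: str, methods: Methods) -> Set[str]:
    """Return the methods reachable from `name` in the call graph (including itself)."""
    seen: Set[str] = set()
    stack = [name]
    while stack:
        current = stack.pop()
        if current in seen or current not in methods:
            continue
        seen.add(current)
        stack.extend(methods[current][1])
    return seen


def feature_set(methods: Methods) -> Set[str]:
    """One hashed feature per method, built only from fuzzy signatures.

    Method and class names never enter the feature, so identifier
    renaming alone does not change the result.
    """
    features: Set[str] = set()
    for name in methods:
        sigs = sorted(methods[n][0] for n in reachable(name, methods))
        text = methods[name][0] + "|" + ",".join(sigs)
        features.add(hashlib.sha256(text.encode()).hexdigest()[:16])
    return features


def containment(lib: Set[str], app: Set[str]) -> float:
    """Fraction of library features that also occur in the app."""
    return len(lib & app) / len(lib) if lib else 0.0


def likely_matches(app: Methods, libraries: Dict[str, Methods],
                   threshold: float = 0.8) -> List[str]:
    """Cheap pre-filter: keep only libraries whose features are mostly present."""
    app_features = feature_set(app)
    return sorted(
        lib_name for lib_name, lib in libraries.items()
        if containment(feature_set(lib), app_features) >= threshold
    )


# A library, and an app that bundles a renamed copy of it plus its own code.
http_lib: Methods = {
    "Http.get": ("X(java.lang.String)", ["Http.open"]),
    "Http.open": ("void(java.net.URL)", []),
}
math_lib: Methods = {"Math.add": ("int(int,int)", [])}
obfuscated_app: Methods = {
    "a.a": ("X(java.lang.String)", ["a.b"]),
    "a.b": ("void(java.net.URL)", []),
    "b.main": ("void()", ["a.a"]),
}

matches = likely_matches(obfuscated_app, {"http": http_lib, "math": math_lib})
```

In this toy example the renamed copy of `http_lib` is still found because the features depend only on signatures and call structure. A real tool would compute digests over far richer features and then confirm each candidate with a more precise comparison.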
