Understanding Android Obfuscation Techniques: A Large-Scale Investigation in the Wild

In this paper, we seek to better understand Android obfuscation and depict a holistic view of how obfuscation is used through a large-scale investigation in the wild. In particular, we focus on four popular obfuscation approaches: identifier renaming, string encryption, Java reflection, and packing. To obtain meaningful statistical results, we designed efficient and lightweight detection models for each obfuscation technique and applied them to our massive APK datasets (collected from Google Play, multiple third-party markets, and malware databases). We learned several interesting facts from the results. For example, malware authors use string encryption more frequently, and more apps on third-party markets than on Google Play are packed. We are also interested in explaining each finding, so we carried out in-depth code analysis on a sample of Android apps. We believe our study will help developers select the most suitable obfuscation approach and, at the same time, help researchers improve code analysis systems in the right direction.
