Effects of Code Obfuscation on Android App Similarity Analysis

Code obfuscation transforms a program into a functionally equivalent one that is harder to reverse engineer and understand. On Android, well-known obfuscation techniques include shrinking, optimization, renaming, string encryption, and control flow transformation. Although obfuscation is intended to protect software, adversaries can also use it maliciously to hide pirated or stolen software, which makes software theft difficult to detect. One possible approach to detecting illegal software that has been transformed by code obfuscation is to measure the similarity between the original and obfuscated programs and determine whether the obfuscated version is an illegal copy of the original. In this paper, we empirically analyze the effects of code obfuscation on Android app similarity analysis. The measurements were performed on five different Android apps using the DashO obfuscator. Experimental results show that similarity measures at the bytecode level are more effective than those at the source code level for analyzing software similarity.
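To make the idea of measuring similarity between an original and an obfuscated program concrete, the following is a minimal sketch that compares two opcode sequences by the Jaccard similarity of their k-gram sets. The abstract does not name the similarity measures used, so this is an illustrative assumption rather than the paper's method. The function names `kgrams` and `jaccard_similarity`, the choice k = 3, and the sample opcode lists are all hypothetical.

```python
def kgrams(opcodes, k=3):
    """Return the set of contiguous windows of k opcodes."""
    return {tuple(opcodes[i:i + k]) for i in range(len(opcodes) - k + 1)}


def jaccard_similarity(a, b, k=3):
    """Jaccard similarity of the k-gram sets of two opcode sequences."""
    grams_a, grams_b = kgrams(a, k), kgrams(b, k)
    if not grams_a and not grams_b:
        return 1.0  # two sequences too short to form any k-gram count as identical
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# Hypothetical opcode sequence of one method in the original app.
original = ["aload_0", "getfield", "iload_1", "iadd", "putfield", "return"]

# Renaming changes identifiers only, so the opcode sequence is unchanged.
renamed = list(original)

# A control flow transformation might insert an extra jump (assumed here).
control_flow = ["aload_0", "getfield", "goto", "iload_1", "iadd", "putfield", "return"]

print(jaccard_similarity(original, renamed))
print(jaccard_similarity(original, control_flow))
```

In this toy setting, identifier renaming leaves the opcode-level score at its maximum, while the inserted jump lowers it because several k-grams no longer match. A source-level comparison of the same methods would also be affected by the renamed identifiers.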
