A Taxonomy of Obfuscating Transformations

It has become increasingly common to distribute software in forms that retain most or all of the information present in the original source code. An important example is Java bytecode. Because such code is easy to decompile, it increases the risk of malicious reverse engineering attacks. In this paper we review several techniques for the technical protection of software secrets. We argue that automatic code obfuscation is currently the most viable method for preventing reverse engineering. We then describe the design of a code obfuscator, a tool that converts a program into an equivalent one that is more difficult to understand and reverse engineer. The obfuscator is based on the application of code transformations, in many cases similar to those used by compiler optimizers. We describe a large number of such transformations, classify them, and evaluate them with respect to their potency (To what degree is a human reader confused?), resilience (How well are automatic deobfuscation attacks resisted?), and cost (How much overhead is added to the application?). Finally, we discuss some possible deobfuscation techniques (such as program slicing) and possible countermeasures an obfuscator could employ against them.
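To make the idea of a behavior-preserving obfuscating transformation concrete, the following is a minimal sketch and not the obfuscator described in the paper. It uses one widely known kind of control-flow obfuscation, an always-true ("opaque") predicate guarding the real code, with a bogus branch that never runs. The function names, the example computation, and the choice of predicate are all hypothetical, and the sketch assumes integer inputs.

```python
def total_price(quantities, unit_price):
    """Original program: sum of quantity * unit_price."""
    total = 0
    for q in quantities:
        total += q * unit_price
    return total


def total_price_obfuscated(quantities, unit_price):
    """Transformed program: same result for integer quantities, harder to read."""
    total = 0
    for q in quantities:
        # Opaque predicate (illustrative assumption): q * (q + 1) is a product
        # of two consecutive integers, so it is always even and this test is
        # always true for integer q.
        if (q * (q + 1)) % 2 == 0:
            total += q * unit_price   # the real computation
        else:
            total -= unit_price       # bogus branch, never executed
    return total


if __name__ == "__main__":
    sample = [1, 2, 3]
    print(total_price(sample, 5), total_price_obfuscated(sample, 5))
```

In the abstract's terms, the bogus branch adds some potency, because a reader must work out that it is dead; the extra multiplication and modulo per iteration are the cost; and resilience is probably low here, since a predicate this simple is the kind of thing an automatic analysis could plausibly detect and remove.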
