Code obfuscation against symbolic execution attacks

Code obfuscation is widely used by software developers to protect intellectual property, and malware writers to hamper program analysis. However, there seems to be little work on systematic evaluations of effectiveness of obfuscation techniques against automated program analysis. The result is that we have no methodical way of knowing what kinds of automated analyses an obfuscation method can withstand. This paper addresses the problem of characterizing the resilience of code obfuscation transformations against automated symbolic execution attacks, complementing existing works that measure the potency of obfuscation transformations against human-assisted attacks through user studies. We evaluated our approach over 5000 different C programs, which have each been obfuscated using existing implementations of obfuscation transformations. The results show that many existing obfuscation transformations, such as virtualization, stand little chance of withstanding symbolic-execution based deobfuscation. A crucial and perhaps surprising observation we make is that symbolic-execution based deobfuscators can easily deobfuscate transformations that preserve program semantics. On the other hand, we present new obfuscation transformations that change program behavior in subtle yet acceptable ways, and show that they can render symbolic-execution based deobfuscation analysis ineffective in practice.

[1]  Lynn E. Garner On the Collatz $3n+1$ algorithm , 1981 .

[2]  Yoann Guillot,et al.  Automatic binary deobfuscation , 2009, Journal in Computer Virology.

[3]  Koen De Bosschere,et al.  Program obfuscation: a quantitative approach , 2007, QoP '07.

[4]  Koushik Sen,et al.  DART: directed automated random testing , 2005, PLDI '05.

[5]  Clark Thomborson,et al.  Manufacturing cheap, resilient, and stealthy opaque constructs , 1998, POPL '98.

[6]  R. Sekar,et al.  On the Limits of Information Flow Techniques for Malware Analysis and Containment , 2008, DIMVA.

[7]  Pascal Junod,et al.  Obfuscator-LLVM -- Software Protection for the Masses , 2015, 2015 IEEE/ACM 1st International Workshop on Software Protection.

[8]  Ilya Mironov,et al.  Applications of SAT Solvers to Cryptanalysis of Hash Functions , 2006, SAT.

[9]  Marco Torchiano,et al.  A family of experiments to assess the effectiveness and efficiency of source code obfuscation techniques , 2013, Empirical Software Engineering.

[10]  Marco Torchiano,et al.  The effectiveness of source code obfuscation: An experimental assessment , 2009, 2009 IEEE 17th International Conference on Program Comprehension.

[11]  Michael Franz,et al.  E unibus pluram: massive-scale software diversity as a defense mechanism , 2010, NSPW '10.

[12]  Debin Gao,et al.  Linear Obfuscation to Combat Symbolic Execution , 2011, ESORICS.

[13]  Zhenkai Liang,et al.  BitBlaze: A New Approach to Computer Security via Binary Analysis , 2008, ICISS.

[14]  Saumya Debray,et al.  A Generic Approach to Automatic Deobfuscation of Executable Code , 2015, 2015 IEEE Symposium on Security and Privacy.

[15]  Nasir D. Memon,et al.  Preventing Piracy, Reverse Engineering, and Tampering , 2003, Computer.

[16]  Dawson R. Engler,et al.  KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs , 2008, OSDI.

[17]  Patrice Godefroid,et al.  Automated Whitebox Fuzz Testing , 2008, NDSS.

[18]  David H. Ackley,et al.  Building diverse computer systems , 1997, Proceedings. The Sixth Workshop on Hot Topics in Operating Systems (Cat. No.97TB100133).

[19]  Toby Walsh,et al.  The Constrainedness of Search , 1996, AAAI/IAAI, Vol. 1.

[20]  Alexander Pretschner,et al.  Software-Based Protection against Changeware , 2015, CODASPY.

[21]  Cesare Tinelli,et al.  Satisfiability Modulo Theories , 2021, Handbook of Satisfiability.

[22]  Jonathon T. Giffin,et al.  Automatic Reverse Engineering of Malware Emulators , 2009, 2009 30th IEEE Symposium on Security and Privacy.

[23]  Nikolai Tillmann,et al.  Demand-Driven Compositional Symbolic Execution , 2008, TACAS.

[24]  David L. Dill,et al.  A Decision Procedure for Bit-Vectors and Arrays , 2007, CAV.

[25]  Amit Sahai,et al.  On the (im)possibility of obfuscating programs , 2001, JACM.

[26]  Saumya Debray,et al.  Symbolic Execution of Obfuscated Code , 2015, CCS.

[27]  David Aucsmith,et al.  Tamper Resistant Software: An Implementation , 1996, Information Hiding.

[28]  Stefan Katzenbeisser,et al.  Protecting Software through Obfuscation , 2016, ACM Comput. Surv..

[29]  Christian S. Collberg,et al.  Distributed application tamper detection via continuous software updates , 2012, ACSAC '12.

[30]  M. Preda Code Obfuscation and Malware Detection by Abstract Interpretation , 2007 .

[31]  Saumya K. Debray,et al.  Deobfuscation: reverse engineering obfuscated code , 2005, 12th Working Conference on Reverse Engineering (WCRE'05).

[32]  Myra B. Cohen,et al.  An orchestrated survey of methodologies for automated software test case generation , 2013, J. Syst. Softw..

[33]  Johannes Kinder Towards Static Analysis of Virtualization-Obfuscated Binaries , 2012, 2012 19th Working Conference on Reverse Engineering.

[34]  David Brumley,et al.  All You Ever Wanted to Know about Dynamic Taint Analysis and Forward Symbolic Execution (but Might Have Been Afraid to Ask) , 2010, 2010 IEEE Symposium on Security and Privacy.

[35]  Dawson R. Engler,et al.  EXE: automatically generating inputs of death , 2006, CCS '06.

[36]  James C. King,et al.  Symbolic execution and program testing , 1976, CACM.

[37]  Christopher Krügel,et al.  Firmalice - Automatic Detection of Authentication Bypass Vulnerabilities in Binary Firmware , 2015, NDSS.

[38]  Mikhail J. Atallah,et al.  Protecting Software Code by Guards , 2001, Digital Rights Management Workshop.

[39]  Cataldo Basile,et al.  Towards a Formal Model for Software Tamper Resistance , 2009 .

[40]  Alexander Pretschner,et al.  A Framework for Measuring Software Obfuscation Resilience against Automated Attacks , 2015, 2015 IEEE/ACM 1st International Workshop on Software Protection.

[41]  David Brumley,et al.  Automatic exploit generation , 2014, CACM.

[42]  Patrice Godefroid,et al.  SAGE: Whitebox Fuzzing for Security Testing , 2012, ACM Queue.

[43]  Brent Waters,et al.  Candidate Indistinguishability Obfuscation and Functional Encryption for all Circuits , 2013, 2013 IEEE 54th Annual Symposium on Foundations of Computer Science.

[44]  Xiaohong Su,et al.  Identifying and Understanding Self-Checksumming Defenses in Software , 2015, CODASPY.

[45]  Ran Canetti,et al.  Studies in program obfuscation , 2010 .

[46]  Guofei Gu,et al.  TaintScope: A Checksum-Aware Directed Fuzzing Tool for Automatic Software Vulnerability Detection , 2010, 2010 IEEE Symposium on Security and Privacy.

[47]  Chris Eagle,et al.  The IDA Pro Book: The Unofficial Guide to the World's Most Popular Disassembler , 2008 .

[48]  Christian S. Collberg,et al.  A Taxonomy of Obfuscating Transformations , 1997 .

[49]  Kevin Coogan,et al.  Deobfuscation of virtualization-obfuscated software: a semantics-based approach , 2011, CCS '11.

[50]  Robert E. Tarjan,et al.  Dynamic Self-Checking Techniques for Improved Tamper Resistance , 2001, Digital Rights Management Workshop.

[51]  Roberto Giacobazzi,et al.  Control code obfuscation by abstract interpretation , 2005, Third IEEE International Conference on Software Engineering and Formal Methods (SEFM'05).

[52]  Jonathon T. Giffin,et al.  Impeding Malware Analysis Using Conditional Code Obfuscation , 2008, NDSS.

[53]  George Candea,et al.  S2E: a platform for in-vivo multi-path analysis of software systems , 2011, ASPLOS XVI.