SEEAD: A Semantic-Based Approach for Automatic Binary Code De-obfuscation

Increasingly sophisticated code obfuscation techniques are quickly adopted by malware developers to evade malware detection and to thwart the reverse engineering efforts of security analysts. State-of-the-art de-obfuscation approaches rely on dynamic analysis, but they face the challenge of low code coverage, because not all execution paths and behaviors of the software are exposed in any given profiling run. As a result, these approaches often fail to discover hidden malicious patterns. This paper introduces SEEAD, a novel and generic semantic-based de-obfuscation system. In building SEEAD, we rely on as few assumptions about the structure of the obfuscation tool as possible, so that the system can keep pace with fast-evolving code obfuscation techniques. To increase code coverage, SEEAD dynamically directs the target program to execute different paths across different runs. This dynamic profiling scheme is combined with taint and control dependence analysis to reduce the search overhead, and with a carefully designed protection scheme that returns the program to an error-free state should any error occur during a dynamic profiling run. The increased code coverage enables us to uncover hidden malicious behaviors that are not detected by traditional de-obfuscation approaches based on dynamic analysis. We evaluate SEEAD on a range of benign and malicious obfuscated programs. Our experimental results show that SEEAD is able to successfully recover the original logic from obfuscated binaries.
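
The idea of directing a program down different paths across runs can be illustrated with a minimal sketch. This is not SEEAD's implementation: it works on a toy Python function rather than a binary, it omits the taint and control dependence analysis that the abstract says is used to prune the search, and the `try`/`except` is only a stand-in for the protection scheme. All names (`toy_program`, `run_forced`, `explore`) are hypothetical.

```python
from collections import deque

def toy_program(branch, x):
    """Hypothetical target: the 'hidden' behavior sits on a rarely taken path."""
    seen = []
    if branch("x > 100", x > 100):
        seen.append("check_passed")
        if branch("x % 7 == 3", x % 7 == 3):
            seen.append("hidden_payload")
        else:
            seen.append("decoy")
    else:
        seen.append("benign")
    return seen

def run_forced(program, forced, x):
    """Run once, overriding the first len(forced) branch outcomes."""
    trace = []

    def branch(label, natural):
        i = len(trace)
        # Use the forced outcome while one is available, else the real one.
        outcome = forced[i] if i < len(forced) else natural
        trace.append(outcome)
        return outcome

    try:
        result = program(branch, x)
    except Exception as exc:
        # Forcing a branch can leave the program in an inconsistent state;
        # here the failure is simply recorded so exploration continues.
        result = ["error: " + type(exc).__name__]
    return tuple(trace), result

def explore(program, x, max_runs=32):
    """Breadth-first search over branch outcomes, one forced run at a time."""
    worklist = deque([()])
    covered = {}
    runs = 0
    while worklist and runs < max_runs:
        forced = worklist.popleft()
        trace, result = run_forced(program, forced, x)
        runs += 1
        if trace in covered:
            continue
        covered[trace] = result
        # Schedule a new run for every not-yet-forced branch, flipped.
        for i in range(len(forced), len(trace)):
            worklist.append(trace[:i] + (not trace[i],))
    return covered

if __name__ == "__main__":
    for path, behavior in explore(toy_program, x=5).items():
        print(path, behavior)
```

With the input `x=5`, a single natural run only reaches the benign branch; the forced runs additionally reach the two paths guarded by `x > 100`. Without any pruning the number of runs grows with the number of branch combinations, which is the overhead that dependence analysis is meant to reduce.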
