Speculative disassembly of binary code

Embedded software is rapidly increasing in complexity. To cope, developers rely on third-party IP to accelerate product delivery. However, IP source code might not be available, which limits verifiability. This poses a particular challenge in safety-critical applications, e.g., automotive. Static Binary Analysis (SBA) is a promising technique for addressing this challenge because it lets engineers reason about the actual instructions executed for all possible inputs. Disassembly, in which assembly instructions are recovered from binary code, is the fundamental first step of any SBA. Correct disassembly, however, is challenging because data is mixed with code in binaries. Moreover, variable-size ISAs, e.g., Thumb and TriCore, allow a single byte sequence to have multiple valid interpretations. We introduce Spedi, an open-source SPEculative DIsassembler for the Thumb ISA. Spedi is based on a principled approach to disassembly in which all possible basic blocks are speculatively recovered. The basic blocks are then refined using conflict analyses to identify the assembly instructions. Experiments on a wide range of benchmarks demonstrate that Spedi is both fast and effective. It outperforms IDA Pro, the de facto industry-standard disassembler, in terms of disassembly correctness. Spedi can also recover the majority of the call graph and switch table targets. It is resilient to obfuscation and does not use any symbol information, which makes it a suitable front end for a wide variety of SBA applications, including security analysis.
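
The ambiguity mentioned above, where one byte sequence admits several valid interpretations, can be illustrated with a minimal Python sketch. It is not Spedi's implementation: it only computes instruction lengths, does not check that encodings are valid, and does not build basic blocks. The function names (`thumb_insn_size`, `speculative_chains`, `conflicts`) are hypothetical. The one architectural fact it relies on is that a Thumb-2 instruction is 32 bits wide when the top five bits of its first halfword are 0b11101, 0b11110, or 0b11111, and 16 bits wide otherwise; little-endian halfwords are assumed.

```python
import struct
from typing import Dict, List, Tuple

Insn = Tuple[int, int]  # (byte offset, size in bytes)


def thumb_insn_size(halfword: int) -> int:
    """Return 4 if this first halfword starts a 32-bit encoding, else 2."""
    return 4 if (halfword >> 11) in (0b11101, 0b11110, 0b11111) else 2


def speculative_chains(code: bytes) -> Dict[int, List[Insn]]:
    """Decode linearly from every halfword-aligned offset.

    Each start offset is treated as a possible instruction boundary,
    so the result holds every candidate decoding, right or wrong.
    """
    chains: Dict[int, List[Insn]] = {}
    for start in range(0, len(code) - 1, 2):
        chain: List[Insn] = []
        pos = start
        while pos + 2 <= len(code):
            (halfword,) = struct.unpack_from("<H", code, pos)
            size = thumb_insn_size(halfword)
            if pos + size > len(code):
                break  # instruction would run past the buffer
            chain.append((pos, size))
            pos += size
        chains[start] = chain
    return chains


def conflicts(a: Insn, b: Insn) -> bool:
    """Two distinct candidates conflict if their byte ranges overlap."""
    (a_off, a_size), (b_off, b_size) = a, b
    return a != b and a_off < b_off + b_size and b_off < a_off + a_size


if __name__ == "__main__":
    # Halfwords 0xF000, 0xF800, 0x4770 stored little-endian.
    code = bytes.fromhex("00f000f87047")
    for start, chain in speculative_chains(code).items():
        print(start, chain)
    print(conflicts((0, 4), (2, 4)))
```

For this six-byte input, decoding from offset 0 yields a 4-byte instruction followed by a 2-byte one, while decoding from offset 2 yields a different 4-byte instruction that overlaps both. A speculative disassembler has to decide which of these overlapping candidates is the real code, and that decision is the role the abstract assigns to conflict analysis.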
