Superset Disassembly: Statically Rewriting x86 Binaries Without Heuristics

Static binary rewriting is a core technology for many systems and security applications, including profiling, optimization, and software fault isolation. While many static binary rewriters have been developed over the past few decades, most make various assumptions about the binary, such as requiring correct disassembly, cooperation from compilers, or access to debugging symbols or relocation entries. This paper presents MULTIVERSE, a new binary rewriter that is able to rewrite Intel CISC binaries without these assumptions. Two fundamental techniques are developed to achieve this: (1) a superset disassembly that completely disassembles the binary code into a superset of instructions in which all legal instructions fall, and (2) an instruction rewriter that is able to relocate all instructions to any other location by mediating all indirect control flow transfers and redirecting them to the correct new addresses. A prototype implementation of MULTIVERSE and evaluation on SPECint 2006 benchmarks shows that MULTIVERSE is able to rewrite all of the testing binaries with a reasonable runtime overhead for the new rewritten binaries. Simple static instrumentation using MULTIVERSE and its comparison with dynamic instrumentation shows that the approach achieves better average performance. Finally, the security applications of MULTIVERSE are exhibited by using it to implement a shadow stack.

[1]  Sotiris Ioannidis,et al.  GPU-Disasm: A GPU-Based X86 Disassembler , 2015, ISC.

[2]  Xiangyu Zhang,et al.  Obfuscation resilient binary code reuse through trace-oriented programming , 2013, CCS.

[3]  Mingwei Zhang,et al.  Control Flow Integrity for COTS Binaries , 2013, USENIX Security Symposium.

[4]  Jeffrey R. Forshaw,et al.  The quantum universe : everything that can happen does happen , 2012 .

[5]  Zhiqiang Lin,et al.  Type Inference on Executables , 2016, ACM Comput. Surv..

[6]  Robert Wahbe,et al.  Efficient software-based fault isolation , 1994, SOSP '93.

[7]  Harish Patil,et al.  Pin: building customized program analysis tools with dynamic instrumentation , 2005, PLDI '05.

[8]  Rajeev Barua,et al.  A compiler-level intermediate representation based binary analysis and rewriting system , 2013, EuroSys '13.

[9]  James S. Trefil The Quantum Universe (And Why Anything That Can Happen Does) , 2012 .

[10]  Per Larsen,et al.  Opaque Control-Flow Integrity , 2015, NDSS.

[11]  Mihai Budiu,et al.  Control-flow integrity principles, implementations, and applications , 2009, TSEC.

[12]  Amitabh Srivastava,et al.  Vulcan Binary transformation in a distributed environment , 2001 .

[13]  Angelos D. Keromytis,et al.  Retrofitting Security in COTS Software with Binary Rewriting , 2011, SEC.

[14]  Murat Kantarcioglu,et al.  Shingled Graph Disassembly: Finding the Undecideable Path , 2014, PAKDD.

[15]  Fred C. Chow,et al.  Engineering a RISC Compiler System , 1986, COMPCON.

[16]  K. De Bosschere,et al.  DIABLO: a reliable, retargetable and extensible link-time rewriting framework , 2005, Proceedings of the Fifth IEEE International Symposium on Signal Processing and Information Technology, 2005..

[17]  Xi Chen,et al.  A Tough Call: Mitigating Advanced Code-Reuse Attacks at the Binary Level , 2016, 2016 IEEE Symposium on Security and Privacy (SP).

[18]  Xi Chen,et al.  CodeArmor: Virtualizing the Code Space to Counter Disclosure Attacks , 2017, 2017 IEEE European Symposium on Security and Privacy (EuroS&P).

[19]  James R. Larus,et al.  EEL: machine-independent executable editing , 1995, PLDI '95.

[20]  Barton P. Miller,et al.  Anywhere, any-time binary instrumentation , 2011, PASTE '11.

[21]  Dinghao Wu,et al.  UROBOROS: Instrumenting Stripped Binaries with Static Reassembling , 2016, 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER).

[22]  Martín Abadi,et al.  XFI: software guards for system address spaces , 2006, OSDI '06.

[23]  Chao Zhang,et al.  Practical Control Flow Integrity and Randomization for Binary Executables , 2013, 2013 IEEE Symposium on Security and Privacy.

[24]  Neha Narula,et al.  Native Client: A Sandbox for Portable, Untrusted x86 Native Code , 2009, IEEE Symposium on Security and Privacy.

[25]  James R. Larus,et al.  Rewriting executable files to measure program behavior , 1994, Softw. Pract. Exp..

[26]  Karthikeyan Sankaralingam,et al.  Power struggles: Revisiting the RISC vs. CISC debate on contemporary ARM and x86 architectures , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).

[27]  Herbert Bos,et al.  Practical Context-Sensitive CFI , 2015, CCS.

[28]  Peter Deutsch,et al.  A Flexible Measurement Tool for Software Systems , 1971, IFIP Congress.

[29]  Kevin W. Hamlen,et al.  Object Flow Integrity , 2017, CCS.

[30]  Alec Wolman,et al.  Instrumentation and optimization of Win32/intel executables using Etch , 1997 .

[31]  Úlfar Erlingsson,et al.  SASI enforcement of security policies: a retrospective , 1999, Proceedings DARPA Information Survivability Conference and Exposition. DISCEX'00.

[32]  Tzi-cker Chiueh,et al.  BIRD: binary interpretation using runtime disassembly , 2006, International Symposium on Code Generation and Optimization (CGO'06).

[33]  Xiangyu Zhang,et al.  BISTRO: Binary Component Extraction and Embedding for Software Security Applications , 2013, ESORICS.

[34]  Stephen McCamant,et al.  Binary Code Extraction and Interface Identification for Security Applications , 2009, NDSS.

[35]  Kevin W. Hamlen,et al.  Securing untrusted code via compiler-agnostic binary rewriting , 2012, ACSAC '12.

[36]  Rajeev Barua,et al.  Static binary rewriting without supplemental information: Overcoming the tradeoff between coverage and correctness , 2013, 2013 20th Working Conference on Reverse Engineering (WCRE).

[37]  Christopher Krügel,et al.  Static Disassembly of Obfuscated Binaries , 2004, USENIX Security Symposium.

[38]  Gregory R. Andrews,et al.  PLTO: A Link-Time Optimizer for the Intel IA-32 Architecture , 2007 .

[39]  Richard L. Sites,et al.  Binary translation , 1993, CACM.

[40]  Christopher Krügel,et al.  Ramblr: Making Reassembly Great Again , 2017, NDSS.

[41]  Christopher Krügel,et al.  How the ELF Ruined Christmas , 2015, USENIX Security Symposium.

[42]  Mingwei Zhang,et al.  A platform for secure static binary instrumentation , 2014, VEE '14.

[43]  Christopher Krügel,et al.  Firmalice - Automatic Detection of Authentication Bypass Vulnerabilities in Binary Firmware , 2015, NDSS.

[44]  Xi Chen,et al.  StackArmor: Comprehensive Protection From Stack-based Memory Error Vulnerabilities for Binaries , 2015, NDSS.

[45]  Stephen McCamant,et al.  Evaluating SFI for a CISC Architecture , 2006, USENIX Security Symposium.

[46]  Dinghao Wu,et al.  Reassembleable Disassembling , 2015, USENIX Security Symposium.

[47]  Michael Laurenzano,et al.  PEBIL: Efficient static binary instrumentation for Linux , 2010, 2010 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS).

[48]  David A. Wagner,et al.  The Performance Cost of Shadow Stacks and Stack Canaries , 2015, AsiaCCS.

[49]  Kevin W. Hamlen,et al.  Binary stirring: self-randomizing instruction addresses of legacy x86 binary code , 2012, CCS.

[50]  Zhiqiang Lin,et al.  PT-CFI: Transparent Backward-Edge Control Flow Violation Detection Using Intel Processor Trace , 2017, CODASPY.