Sound Transpilation from Binary to Machine-Independent Code

To handle the complexity and heterogeneity of modern instruction set architectures, analysis platforms share a common design: the adoption of hardware-independent intermediate representations. Using these platforms to verify systems down to the binary level is appealing because of the high degree of automation they provide. However, it requires trusting the correctness of the translation from binary code to the intermediate language. Achieving a high degree of trust is challenging, since this transpilation must handle (i) all the side effects of the instructions, (ii) multiple instruction encodings (e.g. ARM Thumb), and (iii) variable instruction length (e.g. Intel). We overcome these problems by formally modeling one such intermediate language in the interactive theorem prover HOL4 and by implementing a proof-producing transpiler. This tool translates ARMv8 programs to the intermediate language and generates a HOL4 proof that demonstrates the correctness of the translation in the form of a simulation theorem. We also show how the transpiler theorems can be used to transfer properties verified on the intermediate language to the binary code.
