Paper information - A complete formal semantics of x86-64 user-level instruction set architecture

We present the most complete and thoroughly tested formal semantics of x86-64 to date. Our semantics faithfully formalizes all the non-deprecated, sequential user-level instructions of the x86-64 Haswell instruction set architecture. This totals 3155 instruction variants, corresponding to 774 mnemonics. The semantics is fully executable and has been tested against more than 7,000 instruction-level test cases and the GCC torture test suite. This extensive testing paid off, revealing bugs in both the x86-64 reference manual and other existing semantics. We also illustrate potential applications of our semantics in different formal analyses, and discuss how it can be useful for processor verification.

Grigore Rosu | Sandeep Dasgupta | Vikram S. Adve | Daejun Park | Theodoros Kasampalis

[1] Vikram S. Adve, et al. LLVM: a compilation framework for lifelong program analysis & transformation, 2004, International Symposium on Code Generation and Optimization (CGO 2004).

[2] Niranjan Hasabnis, et al. Extracting instruction semantics via symbolic execution of code generators, 2016, SIGSOFT FSE.

[3] Matt Kaufmann, et al. Engineering a Formal, Executable x86 ISA Simulator for Software Verification, 2017, Provably Correct Systems.

[4] Chucky Ellison, et al. An executable formal semantics of C with applications, 2011, POPL '12.

[5] Shaked Flur, et al. Simplifying ARM concurrency: multicopy-atomic axiomatic and operational models for ARMv8, 2017, Proc. ACM Program. Lang.

[6] Zhong Shao, et al. CertiKOS: An Extensible Architecture for Building Certified Concurrent OS Kernels, 2016, OSDI.

[7] Xavier Leroy, et al. Formal verification of a realistic compiler, 2009, CACM.

[8] Joseph Tassarotti, et al. RockSalt: better, faster, stronger SFI for the x86, 2012, PLDI.

[9] Panagiotis Manolios, et al. Computer-Aided Reasoning: An Approach, 2011.

[10] Thomas W. Reps, et al. WYSINWYX: What You See Is Not What You eXecute, 2005, VSTTE.

[11] Alastair David Reid, et al. Trustworthy specifications of ARM® v8-A and v8-M system level architecture, 2016, 2016 Formal Methods in Computer-Aided Design (FMCAD).

[12] Shobha Vasudevan, et al. Efficient validation input generation in RTL by hybridized source code analysis, 2011, 2011 Design, Automation & Test in Europe.

[13] Matt Kaufmann, et al. Simulation and formal verification of x86 machine-code programs that make system calls, 2014, 2014 Formal Methods in Computer-Aided Design (FMCAD).

[14] Thomas W. Reps, et al. Synthesis of machine code from semantics, 2015, PLDI.

[15] Grigore Rosu, et al. Semantics-based program verifiers for all languages, 2016, OOPSLA.

[16] Stephen McCamant, et al. Path-exploration lifting: hi-fi tests for lo-fi emulators, 2012, ASPLOS XVII.

[17] Eddie Kohler, et al. Making information flow explicit in HiStar, 2006, OSDI '06.

[18] David Brumley, et al. BAP: A Binary Analysis Platform, 2011, CAV.

[19] Tom Ridge, et al. The semantics of x86-CC multiprocessor machine code, 2009, POPL '09.

[20] Kevin P. Lawton. Bochs: A Portable PC Emulator for Unix/X, 1996.

[21] Shobha Vasudevan, et al. Scaling Input Stimulus Generation through Hybrid Static and Dynamic Analysis of RTL, 2014, TODE.

[22] Robert M. Norton, et al. ISA semantics for ARMv8-A, RISC-V, and CHERI-MIPS, 2019, Proc. ACM Program. Lang.

[23] Prabhat Mishra, et al. Directed test generation using concolic testing on RTL models, 2018, 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[24] Binoy Ravindran, et al. Formally verified big step semantics out of x86-64 binaries, 2019, CPP.

[25] Grigore Rosu, et al. Checking reachability using matching logic, 2012, OOPSLA '12.

[26] Thomas W. Reps, et al. TSL: A System for Generating Abstract Interpreters and its Application to Machine-Code Analysis, 2013, TOPL.

[27] Niranjan Hasabnis, et al. Lifting Assembly to Intermediate Representation: A Novel Approach Leveraging Compilers, 2016.

[28] Alexander Aiken, et al. Stochastic superoptimization, 2012, ASPLOS '13.

[29] Peter Sewell, et al. A Better x86 Memory Model: x86-TSO, 2009, TPHOLs.

[30] Harry D. Foster. Trends in functional verification: A 2014 industry study, 2015, 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC).

[31] Chucky Ellison, et al. The K Primer (version 3.3), 2011, K.

[32] Dawson R. Engler, et al. KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs, 2008, OSDI.

[33] Yi Zhang, et al. A formal verification tool for Ethereum VM bytecode, 2018, ESEC/SIGSOFT FSE.

[34] K. Thompson. Reflections on trusting trust, 1984, CACM.

[35] Fabrice Bellard, et al. QEMU, a Fast and Portable Dynamic Translator, 2005, USENIX Annual Technical Conference, FREENIX Track.

[36] Grigore Rosu, et al. An overview of the K semantic framework, 2010, J. Log. Algebraic Methods Program.

[37] Xi Wang, et al. An Empirical Study on the Correctness of Formally Verified Distributed Systems, 2017, EuroSys.

[38] Alexander Aiken, et al. Stratified synthesis: automatically learning the x86-64 instruction set, 2016, PLDI.

[39] Christopher Krügel, et al. SOK: (State of) The Art of War: Offensive Techniques in Binary Analysis, 2016, 2016 IEEE Symposium on Security and Privacy (SP).

[40] Nicholas Nethercote, et al. Valgrind: A Program Supervision Framework, 2003, RV@CAV.