MultiNyx: a multi-level abstraction framework for systematic analysis of hypervisors

MultiNyx is a new framework designed to systematically analyze modern virtual machine monitors (VMMs), which rely on complex processor extensions to enhance their efficiency. To achieve better scalability, MultiNyx introduces selective, multi-level symbolic execution: it analyzes most instructions at a high semantic level and leverages an executable specification (e.g., the Bochs CPU emulator) to analyze complex instructions at a low semantic level. MultiNyx transitions seamlessly between these semantic levels by converting the analysis state between them. Our experiments demonstrate that MultiNyx is practical and effective at analyzing VMMs. By applying MultiNyx to KVM, we automatically generated 206,628 test cases. Many of these test cases revealed inconsistent results with potential security implications. In particular, 98 test cases produced different results across KVM configurations on the Intel architecture, and 641 produced different results across architectures (Intel and AMD). We reported some of these inconsistencies to the KVM developers, one of which has already been patched.
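The core idea of selective, multi-level symbolic execution can be illustrated with a minimal sketch. All names below (`COMPLEX_OPS`, `exec_high`, `exec_low`, the state layout) are hypothetical, not taken from the MultiNyx implementation: a dispatcher handles ordinary instructions with high-level semantics, delegates complex instructions (standing in for VMX/SVM extensions) to a low-level executable specification, and converts state at the boundary in both directions.

```python
# Hypothetical sketch of MultiNyx-style selective dispatch. Handler names
# and the state representation are illustrative, not from the real system.

HIGH_LEVEL = "high"
LOW_LEVEL = "low"

# Instructions assumed too complex for high-level semantics; in MultiNyx
# these would be handled via an executable specification such as Bochs.
COMPLEX_OPS = {"vmlaunch", "vmresume", "vmread", "vmwrite"}

def to_low_level(state):
    """Convert high-level analysis state to the low-level representation."""
    return {"regs": dict(state["regs"]), "level": LOW_LEVEL}

def to_high_level(state):
    """Convert low-level state back after the executable spec runs."""
    return {"regs": dict(state["regs"]), "level": HIGH_LEVEL}

def exec_high(op, state):
    # Placeholder for high-level symbolic execution of one instruction.
    state["regs"]["last"] = "high:" + op
    return state

def exec_low(op, state):
    # Placeholder for running one instruction through an executable
    # specification (e.g., a CPU emulator) at low semantic level.
    state["regs"]["last"] = "low:" + op
    return state

def step(op, state):
    """Dispatch one instruction to the appropriate semantic level."""
    if op in COMPLEX_OPS:
        low = exec_low(op, to_low_level(state))
        return to_high_level(low)
    return exec_high(op, state)

if __name__ == "__main__":
    state = {"regs": {}, "level": HIGH_LEVEL}
    for op in ["mov", "vmlaunch", "add"]:
        state = step(op, state)
        print(op, "->", state["regs"]["last"])
```

The point of the sketch is the boundary: the caller always sees high-level state, while the cost of low-level emulation is paid only for the few instructions whose semantics the high-level engine cannot model.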
