vIOMMU: Efficient IOMMU Emulation

Direct device assignment, where a guest virtual machine interacts directly with an I/O device without host intervention, is appealing because it allows an unmodified (non-hypervisor-aware) guest to achieve near-native performance. But device assignment for unmodified guests suffers from two serious deficiencies: (1) it requires pinning all of the guest's pages, thereby disallowing memory overcommitment, and (2) it exposes the guest's memory to buggy device drivers. We solve these problems by designing, implementing, and exposing an emulated IOMMU (vIOMMU) to the unmodified guest. We employ two novel optimizations to make vIOMMU perform well: (1) waiting a few milliseconds before tearing down an IOMMU mapping, in the hope that it will be immediately reused ("optimistic teardown"), and (2) running the vIOMMU on a sidecore, thereby enabling, for the first time, the use of a sidecore by unmodified guests. Both optimizations are highly effective in isolation. The former allows a bare-metal system to achieve 100% of a 10 Gbps line rate. The combination of the two allows an unmodified guest to do the same.
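
The idea behind optimistic teardown can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's implementation: the class name `OptimisticIommu`, its methods, the 10 ms grace period, and the explicit `now_ms` clock argument are all hypothetical and chosen for illustration. An unmap request does not destroy the mapping at once; it records a deadline, and a map request for the same page that arrives before the deadline reuses the existing mapping instead of creating a new one.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class OptimisticIommu:
    """Toy model of deferred ("optimistic") IOMMU mapping teardown."""

    grace_ms: float = 10.0  # hypothetical grace period, not a value from the paper
    mappings: Dict[int, int] = field(default_factory=dict)   # guest page -> IOVA
    pending: Dict[int, float] = field(default_factory=dict)  # guest page -> teardown deadline
    next_iova: int = 0
    created: int = 0    # mappings actually installed
    reused: int = 0     # map requests served by a mapping awaiting teardown
    torn_down: int = 0  # mappings actually removed

    def expire(self, now_ms: float) -> None:
        """Tear down every mapping whose grace period has elapsed."""
        for page in [p for p, deadline in self.pending.items() if deadline <= now_ms]:
            del self.pending[page]
            del self.mappings[page]
            self.torn_down += 1

    def map(self, page: int, now_ms: float) -> int:
        """Return an IOVA for the page, reusing a pending mapping if possible."""
        self.expire(now_ms)
        if page in self.pending:
            # The teardown was deferred and the page is needed again: cancel it.
            del self.pending[page]
            self.reused += 1
            return self.mappings[page]
        if page in self.mappings:
            return self.mappings[page]
        iova = self.next_iova
        self.next_iova += 1
        self.mappings[page] = iova
        self.created += 1
        return iova

    def unmap(self, page: int, now_ms: float) -> None:
        """Schedule the teardown instead of performing it immediately."""
        if page in self.mappings:
            self.pending[page] = now_ms + self.grace_ms


if __name__ == "__main__":
    iommu = OptimisticIommu(grace_ms=10.0)
    # A driver that maps and unmaps the same buffer page once per millisecond.
    for t in range(100):
        iommu.map(0x1000, float(t))
        iommu.unmap(0x1000, float(t) + 0.5)
    print(iommu.created, iommu.reused, iommu.torn_down)
```

In this toy model, a driver that repeatedly maps and unmaps the same buffer within the grace period pays for a single mapping creation. The cost of the deferral is that a mapping stays valid for up to `grace_ms` after the unmap request.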
