On the DMA mapping problem in direct device assignment

I/O intensive workloads running in virtual machines can suffer massive performance degradation. Direct assignment of I/O devices to virtual machines is the best performing I/O virtualization mechanism, but its performance still remains far from the bare-metal (non-virtualized) case. The primary gap between direct assignment I/O performance and bare-metal I/O performance is the overhead of mapping the VM's memory pages for DMA in IOMMU translation tables. One could avoid this overhead by mapping all of the VM's pages for the lifetime of the VM, but this leads to memory consumption which is unacceptable in many scenarios. The DMA mapping problem can be stated briefly as "when should a memory page be mapped or unmapped for DMA?" We begin by presenting a theoretical framework for reasoning about the DMA mapping problem. Then, using a quota-based approach, we propose the on-demand DMA mapping strategy, which provides the best DMA mapping performance for a given amount of memory consumed. In particular, on-demand mapping can achieve the same performance as state-of-the-art mapping strategies while consuming much less memory (exact amount depends on the workload's requirements). We present the design and implementation of on-demand mapping in the Linux-based KVM hypervisor and an experimental evaluation of its application to various workloads.

[1]  Jose Renato Santos,et al.  Bridging the Gap between Software and Hardware Techniques for I/O Virtualization , 2008, USENIX Annual Technical Conference.

[2]  Alan L. Cox,et al.  Concurrent Direct Network Access for Virtual Machine Monitors , 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.

[3]  Muli Ben-Yehuda,et al.  IsoStack - Highly Efficient Network Processing on Dedicated Cores , 2010, USENIX Annual Technical Conference.

[4]  Xiaoxin Chen,et al.  Overshadow: a virtualization-based approach to retrofitting protection in commodity operating systems , 2008, ASPLOS.

[5]  Beng-Hong Lim,et al.  Virtualizing I/O Devices on VMware Workstation's Hosted Virtual Machine Monitor , 2001, USENIX Annual Technical Conference, General Track.

[6]  Alan L. Cox,et al.  Protection Strategies for Direct Access to Virtualized I/O Devices , 2008, USENIX Annual Technical Conference.

[7]  Dhabaleswar K. Panda,et al.  High Performance VMM-Bypass I/O in Virtual Machines , 2006, USENIX Annual Technical Conference, General Track.

[8]  Carl A. Waldspurger,et al.  Memory resource management in VMware ESX server , 2002, OSDI '02.

[9]  Karsten Schwan,et al.  High performance and scalable I/O virtualization via self-virtualized devices , 2007, HPDC '07.

[10]  Zhao Yu,et al.  SR-IOV Networking in Xen: Architecture, Design and Implementation , 2008, Workshop on I/O Virtualization.

[11]  Allan Borodin,et al.  Online computation and competitive analysis , 1998 .

[12]  George Varghese,et al.  Difference engine , 2010, OSDI.

[13]  Laszlo A. Belady,et al.  A Study of Replacement Algorithms for Virtual-Storage Computer , 1966, IBM Syst. J..

[14]  J. Carter,et al.  A Comparison of Software and Hardware , 1996 .

[15]  Gil Neiger,et al.  Intel ® Virtualization Technology for Directed I/O , 2006 .

[16]  Alan L. Cox,et al.  Achieving 10 Gb/s using safe and transparent network interface virtualization , 2009, VEE '09.

[17]  Andrew Warfield,et al.  Safe Hardware Access with the Xen Virtual Machine Monitor , 2007 .

[18]  Jiesheng Wu,et al.  Memory registration caching correctness , 2005, CCGrid 2005. IEEE International Symposium on Cluster Computing and the Grid, 2005..

[19]  Stefan Götz,et al.  Unmodified Device Driver Reuse and Improved System Dependability via Virtual Machines , 2004, OSDI.

[20]  Rusty Russell,et al.  virtio: towards a de-facto standard for virtual I/O devices , 2008, OPSR.

[21]  Gerald J. Popek,et al.  Formal requirements for virtualizable third generation architectures , 1974, SOSP '73.

[22]  Jimi Xenidis,et al.  Utilizing IOMMUs for Virtualization in Linux and Xen Muli , 2006 .

[23]  A. Kivity,et al.  kvm : the Linux Virtual Machine Monitor , 2007 .

[24]  Amos Fiat,et al.  Experimental Studies of Access Graph Based Heuristics: Beating the LRU Standard? , 1997, SODA.

[25]  Muli Ben-Yehuda,et al.  Scalable I/O - A Well-Architected Way to Do Scalable, Secure and Virtualized I/O , 2008, Workshop on I/O Virtualization.

[26]  Ole Agesen,et al.  A comparison of software and hardware techniques for x86 virtualization , 2006, ASPLOS XII.

[27]  Chris Mason,et al.  Paravirtualized Paging , 2008, Workshop on I/O Virtualization.

[28]  Fabrice Bellard,et al.  QEMU, a Fast and Portable Dynamic Translator , 2005, USENIX ATC, FREENIX Track.

[29]  Allan Borodin,et al.  Competitive paging with locality of reference , 1991, STOC '91.

[30]  Jiuxing Liu,et al.  Virtualization polling engine (VPE): using dedicated CPU cores to accelerate I/O virtualization , 2009, ICS.

[31]  Srilatha Manne,et al.  Accelerating two-dimensional page walks for virtualized systems , 2008, ASPLOS.

[32]  Muli Ben-Yehuda,et al.  Direct Device Assignment for Untrusted Fully-Virtualized Virtual Machines , 2008 .

[33]  Jiuxing Liu Evaluating standard-based self-virtualizing devices: A performance study on 10 GbE NICs with SR-IOV support , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).