Tesseract: reconciling guest I/O and hypervisor swapping in a VM

Double-paging is an often-cited, if unsubstantiated, problem in multi-level scheduling of memory between virtual machines (VMs) and the hypervisor. This problem occurs when both a virtualized guest and the hypervisor overcommit their respective physical address-spaces. When the guest pages out memory previously swapped out by the hypervisor, it initiates an expensive sequence of steps causing the contents to be read in from the hypervisor swapfile only to be written out again, significantly lengthening the time to complete the guest I/O request. As a result, performance rapidly drops. We present Tesseract, a system that directly and transparently addresses the double-paging problem. Tesseract tracks when guest and hypervisor I/O operations are redundant and modifies these I/Os to create indirections to existing disk blocks containing the page contents. Although our focus is on reconciling I/Os between the guest disks and hypervisor swap, our technique is general and can reconcile, or deduplicate, I/Os for guest pages read or written by the VM. Deduplication of disk blocks for file contents accessed in a common manner is well-understood. One challenge that our approach faces is that the locality of guest I/Os (reflecting the guest's notion of disk layout) often differs from that of the blocks in the hypervisor swap. This loss of locality through indirection results in significant performance loss on subsequent guest reads. We propose two alternatives to recovering this lost locality, each based on the idea of asynchronously reorganizing the indirected blocks in persistent storage. We evaluate our system and show that it can significantly reduce the costs of double-paging. We focus our experiments on a synthetic benchmark designed to highlight its effects. In our experiments we observe Tesseract can improve our benchmark's throughput by as much as 200% when using traditional disks and by as much as 30% when using SSD. At the same time worst case application responsiveness can be improved by a factor of 5.