vCache: Architectural support for transparent and isolated virtual LLCs in virtualized environments

A key role of virtualization is to give an illusion that a consolidated workload runs on a dedicated machine although the underlying resources are actively shared by multiple workloads. Technical advances have enabled a virtual machine (VM) to exercise many shared resources of a machine in a transparent and isolated manner. However, such an illusion of resource dedication has not been supported for the last-level cache (LLC), although the LLC is the largest on-chip shared resource with a significant performance impact. In this paper, we propose vCache-architectural support to provide a transparent and isolated virtual LLC (vLLC) for each VM and interfaces to manage the vLLC. More specifically, this study first proposes architectural support for the guest OS of a VM to index the LLC with its guest physical address instead of a host physical address. This in turn allows that the guest OS transparently view its vLLC and preserve the effectiveness of its page placement policy. Second, this study extends the architectural support for each VM to keep its vLLC strongly isolated from other VMs. Such resource dedication is critical to offer performance isolation and preserve vLLC transparency for each VM in a highly consolidated machine. With little hardware overhead, vCache can facilitate various unchartered vLLC capacity-based services for the public clouds while providing up to 17% higher performance than a traditional virtualized system.

[1]  Peter J. Varman,et al.  mClock: Handling Throughput Variability for Hypervisor IO Scheduling , 2010, OSDI.

[2]  Yale N. Patt,et al.  Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).

[3]  Rajeev Balasubramonian,et al.  Dynamic hardware-assisted software-controlled page placement to manage capacity allocation and sharing within large caches , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.

[4]  Carl A. Waldspurger,et al.  Memory resource management in VMware ESX server , 2002, OSDI '02.

[5]  Brian N. Bershad,et al.  Dynamic Page Mapping Policies for Cache Conflict Resolution on Standard Hardware , 1994, OSDI.

[6]  Xiaoning Ding,et al.  ULCC: a user-level facility for optimizing shared cache performance on multicores , 2011, PPoPP '11.

[7]  David K. Tam,et al.  Managing Shared L2 Caches on Multicore Systems in Software , 2007 .

[8]  Zhao Zhang,et al.  Gaining insights into multicore cache partitioning: Bridging the gap between simulation and real systems , 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.

[9]  Michael Stumm,et al.  Reducing the harmful effects of last-level cache polluters with an OS-level, software-only pollute buffer , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.

[10]  Brad Calder,et al.  Reducing cache misses using hardware and software page placement , 1999, ICS '99.

[11]  Sanjay Chaudhary,et al.  Application Performance Isolation in Virtualization , 2009, 2009 IEEE International Conference on Cloud Computing.

[12]  Robert P. Goldberg,et al.  Survey of virtual machine research , 1974, Computer.

[13]  Min Huang,et al.  An Energy Efficient 32-nm 20-MB Shared On-Die L3 Cache for Intel® Xeon® Processor E5 Family , 2013, IEEE Journal of Solid-State Circuits.

[14]  Todd C. Mowry,et al.  Compiler-directed page coloring for multiprocessors , 1996, ASPLOS VII.

[15]  Aman Kansal,et al.  Q-clouds: managing performance interference effects for QoS-aware clouds , 2010, EuroSys '10.

[16]  Yan Solihin,et al.  Fair cache sharing and partitioning in a chip multiprocessor architecture , 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004..

[17]  Richard E. Kessler,et al.  Page placement algorithms for large real-indexed caches , 1992, TOCS.

[18]  Andrew Warfield,et al.  Live migration of virtual machines , 2005, NSDI.

[19]  James E. Smith,et al.  Virtual private caches , 2007, ISCA '07.

[20]  Jennifer Rexford,et al.  NoHype: virtualized cloud infrastructure without the virtualization , 2010, ISCA.

[21]  David A. Patterson,et al.  A hardware evaluation of cache partitioning to improve utilization and energy-efficiency while preserving responsiveness , 2013, ISCA.

[22]  Scott Devine,et al.  Disco: running commodity operating systems on scalable multiprocessors , 1997, TOCS.

[23]  Uday Bondhugula,et al.  A practical automatic polyhedral parallelizer and locality optimizer , 2008, PLDI '08.

[24]  Jie Liu,et al.  Cuanta: quantifying effects of shared on-chip resource interference for consolidated virtual machines , 2011, SoCC.

[25]  Fabio Checconi,et al.  Providing Performance Guarantees to Virtual Machines Using Real-Time Scheduling , 2010, Euro-Par Workshops.

[26]  Vanish Talwar,et al.  Pegasus: Coordinated Scheduling for Virtualized Accelerator-based Systems , 2011, USENIX Annual Technical Conference.

[27]  Jung Ho Ahn,et al.  McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[28]  Fredrik Larsson,et al.  Simics: A Full System Simulation Platform , 2002, Computer.

[29]  Peter Davies,et al.  The TLB slice-a low-cost high-speed address translation mechanism , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.

[30]  G. Edward Suh,et al.  A new memory monitoring scheme for memory-aware scheduling and partitioning , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.

[31]  Xiao Zhang,et al.  Towards practical page coloring-based multicore cache management , 2009, EuroSys '09.

[32]  Norman P. Jouppi,et al.  CACTI: an enhanced cache access and cycle time model , 1996, IEEE J. Solid State Circuits.

[33]  Amin Vahdat,et al.  Enforcing Performance Isolation Across Virtual Machines in Xen , 2006, Middleware.