Supporting Overcommitted Virtual Machines through Hardware Spin Detection

Multiprocessor operating systems (OSs) pose several unique and conflicting challenges to System Virtual Machines (System VMs). For example, most existing system VMs resort to gang scheduling a guest OS's virtual processors (VCPUs) to avoid OS synchronization overhead. However, gang scheduling is infeasible for some application domains, and is inflexible in other domains. In an overcommitted environment, an individual guest OS has more VCPUs than available physical processors (PCPUs), precluding the use of gang scheduling. In such an environment, we demonstrate a more than two-fold increase in application runtime when transparently virtualizing a chip-multiprocessor's cores. To combat this problem, we propose a hardware technique to detect when a VCPU is wasting CPU cycles, and preempt that VCPU to run a different, more productive VCPU. Our technique can dramatically reduce cycles wasted on OS synchronization, without requiring any semantic information from the software. We then present a server consolidation case study to demonstrate the potential of more flexible scheduling policies enabled by our technique. We propose one such policy that logically partitions the CMP cores between guest VMs. This policy increases throughput by 10-25 percent for consolidated server workloads due to improved cache locality and core utilization.

[1]  Paul Barford,et al.  Generating representative Web workloads for network and server performance evaluation , 1998, SIGMETRICS '98/PERFORMANCE '98.

[2]  Koushik Chakraborty,et al.  Adapting to intermittent faults in multicore systems , 2008, ASPLOS.

[3]  Bryan S. Rosenburg Low-synchronization translation lookaside buffer consistency in large-scale shared-memory multiprocessors , 1989, SOSP '89.

[4]  Fredrik Larsson,et al.  Simics: A Full System Simulation Platform , 2002, Computer.

[5]  Brian N. Bershad,et al.  Scheduler activations: effective kernel support for the user-level management of parallelism , 1991, TOCS.

[6]  Gil Neiger,et al.  Intel virtualization technology , 2005, Computer.

[7]  David Larson,et al.  Advanced virtualization capabilities of POWER5 systems , 2005, IBM J. Res. Dev..

[8]  Marianne Shaw,et al.  Scale and performance in the Denali isolation kernel , 2002, OSDI '02.

[9]  J.E. Smith,et al.  Achieving high performance via co-designed virtual machines , 1998, Innovative Architecture for Future Generation High-Performance Processors and Systems.

[10]  Sanjay J. Patel,et al.  Characterizing the effects of transient faults on a high-performance processor pipeline , 2004, International Conference on Dependable Systems and Networks, 2004.

[11]  David A. Wood,et al.  Variability in architectural simulations of multi-threaded workloads , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings..

[12]  Koushik Chakraborty,et al.  Computation spreading: employing hardware migration to specialize CMP cores on-the-fly , 2006, ASPLOS XII.

[13]  Tong Li,et al.  Spin detection hardware for improved management of multithreaded systems , 2006, IEEE Transactions on Parallel and Distributed Systems.

[14]  James R. Goodman,et al.  Speculative lock elision: enabling highly concurrent multithreaded execution , 2001, MICRO.

[15]  Andrea C. Arpaci-Dusseau,et al.  Implicit coscheduling: coordinated scheduling with implicit information in distributed systems , 2001, TOCS.

[16]  Hermann Härtig,et al.  Pragmatic Nonblocking Synchronization for Real-Time Systems , 2001, USENIX Annual Technical Conference, General Track.

[17]  Mendel Rosenblum,et al.  Cellular disco: resource management using virtual clusters on shared-memory multiprocessors , 2000, TOCS.

[18]  Renato J. O. Figueiredo,et al.  Guest Editors' Introduction: Resource Virtualization Renaissance , 2005, Computer.

[19]  Koushik Chakraborty,et al.  Hardware support for spin management in overcommitted virtual machines , 2006, 2006 International Conference on Parallel Architectures and Compilation Techniques (PACT).

[20]  Mikko H. Lipasti,et al.  Temporally silent stores , 2002, ASPLOS X.

[21]  John K. Ousterhout,et al.  Scheduling Techniques for Concurrent Systems , 1982, ICDCS.

[22]  T. N. Vijaykumar,et al.  Heat-and-run: leveraging SMT and CMP to manage power density through the operating system , 2004, ASPLOS XI.

[23]  Josep Torrellas,et al.  Evaluating the Performance of Cache-Affinity Scheduling in Shared-Memory Multiprocessors , 1995, J. Parallel Distributed Comput..

[24]  James R. Larus,et al.  Using Cohort-Scheduling to Enhance Server Performance , 2002, USENIX Annual Technical Conference, General Track.

[25]  Joshua LeVasseur,et al.  Towards Scalable Multiprocessor Virtual Machines , 2004, Virtual Machine Research and Technology Symposium.

[26]  John K. Ousterhout Scheduling Techniques for Concurrebt Systems. , 1982, ICDCS 1982.

[27]  John C. Thomas,et al.  Running Quake II on a grid , 2006, IBM Syst. J..

[28]  James E. Smith,et al.  Virtual machines - versatile platforms for systems and processes , 2005 .

[29]  VMware ESX Server 3 Ready Time Observations , 2004 .

[30]  Thomas F. Wenisch,et al.  Temporal Streaming of Shared Memory , 2005, ISCA 2005.

[31]  Gurindar S. Sohi,et al.  Multiscalar processors , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.

[32]  Irith Pomeranz,et al.  Transient-fault recovery for chip multiprocessors , 2003, 30th Annual International Symposium on Computer Architecture, 2003. Proceedings..

[33]  Carl A. Waldspurger,et al.  Memory resource management in VMware ESX server , 2002, OSDI '02.

[34]  Norman P. Jouppi,et al.  Single-ISA heterogeneous multi-core architectures for multithreaded workload performance , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..