Shared Resource Monitoring and Throughput Optimization in Cloud-Computing Datacenters

Many data centers employ server consolidation to maximize the efficiency of platform resource usage. As a result, multiple virtual machines (VMs) run simultaneously on each data center platform. Contention for shared resources among these VMs has an undesirable and non-deterministic impact on their performance. This paper proposes the use of shared resource monitoring to (a) understand the resource usage of each VM on each platform, (b) collect resource usage and performance data across platforms to correlate resource usage with performance, and (c) migrate resource-constrained VMs to improve overall data center throughput and Quality of Service (QoS). We focus on monitoring and addressing shared cache contention, and propose a new optimization metric that captures both the priority of each VM and the overall weighted throughput of the data center. We conduct detailed experiments emulating data center scenarios, including on-line transaction processing workloads (based on TPC-C), middle-tier workloads (based on SPECjbb and SPECjAppServer), and financial workloads (based on PARSEC). We show that monitoring shared resource contention (such as shared cache contention) is highly beneficial for managing throughput and QoS in a cloud-computing data center environment.
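The abstract does not define the proposed optimization metric, so the following is only an illustrative sketch of what a priority-weighted throughput measure could look like, not the paper's formulation. It assumes each VM has a numeric priority, a measured throughput under contention, and a baseline throughput measured when running alone; the names `VM`, `weighted_throughput`, and `most_constrained` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VM:
    name: str
    priority: float         # assumed weight; higher means more important
    throughput: float       # measured throughput under shared-resource contention
    solo_throughput: float  # assumed baseline: throughput when running alone


def weighted_throughput(vms: List[VM]) -> float:
    """Sum of each VM's normalized throughput scaled by its priority.

    Illustrative stand-in for a data-center-wide weighted throughput metric.
    """
    return sum(vm.priority * vm.throughput / vm.solo_throughput for vm in vms)


def most_constrained(vms: List[VM]) -> VM:
    """Return the VM with the largest priority-weighted slowdown.

    Under this sketch's assumptions, that VM is the first migration candidate.
    """
    return max(
        vms,
        key=lambda vm: vm.priority * (1.0 - vm.throughput / vm.solo_throughput),
    )
```

In this sketch, a scheduler would compare `weighted_throughput` before and after a candidate migration of the VM returned by `most_constrained` and keep the placement with the higher value. How the paper actually weights priority against throughput may differ.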
