Modeling Cross-Architecture Co-Tenancy Performance Interference

Cloud computing has become a dominant paradigm for providing elastic, affordable computing resources to end users. With the increased computing power of modern multi-core and many-core machines, data centers often co-locate multiple virtual machines (VMs) on one physical machine, resulting in co-tenancy and, consequently, resource sharing and contention. Applications or VMs co-located on one physical machine can interfere with each other despite the promise of performance isolation through virtualization. Modeling and predicting co-run interference is therefore critical for data center job scheduling and QoS (Quality of Service) assurance. Co-run interference can be characterized by two metrics, sensitivity and pressure: the former denotes how an application's performance is affected by its co-run applications, and the latter measures how it impacts the performance of its co-run applications. This paper shows that both sensitivity and pressure are application- and architecture-dependent. Further, we propose a regression model that predicts an application's sensitivity and pressure across architectures with high accuracy. This regression model enables a data center scheduler to guarantee the QoS of a VM or application when it is scheduled to co-locate with other VMs or applications.
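To make the sensitivity/pressure prediction concrete, here is a minimal sketch, not the paper's actual model: it assumes a regression that maps an application's solo-run profile (hypothetical features such as last-level cache miss rate, memory bandwidth, and solo IPC) to its co-run sensitivity and pressure on a target architecture. All feature names, data values, and the choice of a linear model are illustrative assumptions.

```python
# Illustrative sketch only: a regression from solo-run profile features to
# co-run sensitivity and pressure on a target architecture. The features,
# training data, and linear model are assumptions, not the paper's method.
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training set: each row profiles one application running alone
# on a source architecture (LLC miss rate, memory bandwidth in GB/s, solo IPC).
X_train = np.array([
    [0.12, 3.4, 1.8],   # app A
    [0.05, 1.1, 2.3],   # app B
    [0.30, 6.2, 0.9],   # app C
])

# Measured co-run interference on the target architecture:
# column 0 = sensitivity (slowdown suffered), column 1 = pressure (slowdown caused).
y_train = np.array([
    [0.25, 0.18],
    [0.08, 0.05],
    [0.52, 0.40],
])

# Fit one multi-output regression covering both metrics.
model = LinearRegression().fit(X_train, y_train)

# Predict sensitivity and pressure for a new application's solo-run profile;
# a scheduler could use these predictions to pick a co-location that keeps
# the expected slowdown within a QoS budget.
new_app = np.array([[0.20, 4.5, 1.2]])
sensitivity, pressure = model.predict(new_app)[0]
print(f"predicted sensitivity={sensitivity:.2f}, pressure={pressure:.2f}")
```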
