Modeling Cross-Architecture Co-Tenancy Performance Interference

Cloud computing has become a dominant paradigm for providing elastic, affordable computing resources to end users. With the increased computing power of modern multi-core and many-core machines, data centers often co-locate multiple virtual machines (VMs) on one physical machine, resulting in co-tenancy and, consequently, resource sharing and contention. Applications or VMs co-located on one physical machine can interfere with each other despite the promise of performance isolation through virtualization. Modeling and predicting co-run interference is therefore critical for data center job scheduling and QoS (Quality of Service) assurance. Co-run interference can be characterized by two metrics, sensitivity and pressure: the former denotes how an application's performance is affected by its co-run applications, and the latter measures how it impacts the performance of its co-run applications. This paper shows that both sensitivity and pressure are application- and architecture-dependent. Further, we propose a regression model that predicts an application's sensitivity and pressure across architectures with high accuracy. This regression model enables a data center scheduler to guarantee the QoS of a VM or application when it is scheduled to co-locate with other VMs or applications.
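To make the sensitivity/pressure prediction concrete, here is a minimal sketch, not the paper's actual model: it assumes a regression that maps an application's solo-run profile (hypothetical features such as last-level cache miss rate, memory bandwidth, and solo IPC) to its co-run sensitivity and pressure on a target architecture. All feature names, data values, and the choice of a linear model are illustrative assumptions.

```python
# Illustrative sketch only: a regression from solo-run profile features to
# co-run sensitivity and pressure on a target architecture. The features,
# training data, and linear model are assumptions, not the paper's method.
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training set: each row profiles one application running alone
# on a source architecture (LLC miss rate, memory bandwidth in GB/s, solo IPC).
X_train = np.array([
    [0.12, 3.4, 1.8],   # app A
    [0.05, 1.1, 2.3],   # app B
    [0.30, 6.2, 0.9],   # app C
])

# Measured co-run interference on the target architecture:
# column 0 = sensitivity (slowdown suffered), column 1 = pressure (slowdown caused).
y_train = np.array([
    [0.25, 0.18],
    [0.08, 0.05],
    [0.52, 0.40],
])

# Fit one multi-output regression covering both metrics.
model = LinearRegression().fit(X_train, y_train)

# Predict sensitivity and pressure for a new application's solo-run profile;
# a scheduler could use these predictions to pick a co-location that keeps
# the expected slowdown within a QoS budget.
new_app = np.array([[0.20, 4.5, 1.2]])
sensitivity, pressure = model.predict(new_app)[0]
print(f"predicted sensitivity={sensitivity:.2f}, pressure={pressure:.2f}")
```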
