From chaos to QoS: case studies in CMP resource management

As more and more cores are enabled on the die of future CMP platforms, we expect that several diverse workloads will run simultaneously on the platform. A key example of this trend is the growth of virtualization usage models. When multiple virtual machines or applications or threads run simultaneously, the quality of service (QoS) that the platform provides to each individual thread is non-deterministic today. This occurs because the simultaneously running threads place very different demands on the shared resources (cache space, memory bandwidth, etc) in the platform and in most cases contend with each other. In this paper, we first present case studies that show how this results in non-deterministic performance. Unlike the compute resources managed through scheduling, platform resource allocation to individual threads cannot be controlled today. In order to provide better determinism and QoS, we then examine resource management mechanisms and present QoS-aware architectures and execution environments. The main contribution of this paper is the architecture feasibility analysis through prototypes that allow experimentation with QoS-Aware execution environments and architectural resources. We describe these QoS prototypes and then present preliminary case studies of multi-tasking and virtualization usage models sharing one critical CMP resource (last-level cache). We then demonstrate how proper management of the cache resource can provide service differentiation and deterministic performance behavior when running disparate workloads in future CMP platforms.

[1]  Won-Taek Lim,et al.  Architectural support for operating system-driven CMP cache management , 2006, 2006 International Conference on Parallel Architectures and Compilation Techniques (PACT).

[2]  Ravi R. Iyer,et al.  CQoS: a framework for enabling QoS in shared caches of CMP platforms , 2004, ICS '04.

[3]  Gil Neiger,et al.  Intel virtualization technology , 2005, Computer.

[4]  Kunle Olukotun,et al.  The case for a single-chip multiprocessor , 1996, ASPLOS VII.

[5]  Yan Solihin,et al.  Predicting inter-thread cache contention on a chip multi-processor architecture , 2005, 11th International Symposium on High-Performance Computer Architecture.

[6]  Srihari Makineni,et al.  Communist, Utilitarian, and Capitalist cache policies on CMPs: Caches as a shared resource , 2006, 2006 International Conference on Parallel Architectures and Compilation Techniques (PACT).

[7]  Yale N. Patt,et al.  Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).

[8]  Rich Uhlig Taking Virtualization Mainstream on IntelŴArchitecture Platforms , 2006 .

[9]  S. Kim,et al.  Fair cache sharing and partitioning in a chip multiprocessor architecture , 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004..

[10]  Robert P. Goldberg,et al.  Survey of virtual machine research , 1974, Computer.

[11]  Tal Garfinkel,et al.  Virtual machine monitors: current technology and future trends , 2005, Computer.

[12]  Jeanna Neefe Matthews,et al.  Performance Isolation of a Misbehaving Virtual Machine with Xen , VMware and Solaris Containers , 2006 .

[13]  Richard Uhlig,et al.  SoftSDV: A Presilicon Software Development Environment for the IA-64 Architecture , 1999 .

[14]  Changkyu Kim,et al.  Nonuniform Cache Architectures for Wire-Delay Dominated On-Chip Caches , 2003, IEEE Micro.