Quality of service shared cache management in chip multiprocessor architecture

The trends in enterprise IT toward service-oriented computing, server consolidation, and virtual computing point to a future in which workloads are becoming increasingly diverse in terms of performance, reliability, and availability requirements. It can be expected that more and more applications with diverse requirements will run on a Chip Multi-Processor (CMP) and share platform resources such as the lowest level cache and off-chip bandwidth. In this environment, it is desirable to have microarchitecture and software support that can provide a guarantee of a certain level of performance, which we refer to as performance Quality of Service. In this article, we investigated a framework would be needed to manage the shared cache resource for fully providing QoS in a CMP. We found in order to fully provide QoS, we need to specify an appropriate QoS target for each job and apply an admission control policy to accept jobs only when their QoS targets can be satisfied. We also found that providing strict QoS often leads to a significant reduction in throughput due to resource fragmentation. We proposed throughput optimization techniques that include: (1) exploiting various QoS execution modes, and (2) a microarchitecture technique, which we refer to as resource stealing, that detects and reallocates excess cache capacity from a job while preserving its QoS target. We designed and evaluated three algorithms for performing resource stealing, which differ in how aggressive they are in stealing excess cache capacity, and in the degree of confidence in meeting QoS targets. In addition, we proposed a mechanism to dynamically enable or disable resource stealing depending on whether other jobs can benefit from additional cache capacity. We evaluated our QoS framework with a full system simulation of a 4-core CMP and a recent version of the Linux Operating System. We found that compared to an unoptimized scheme, the throughput can be improved by up to 47%, making the throughput significantly closer to a non-QoS CMP.

[1]  Glenn Reinman,et al.  Fast and fair: data-stream quality of service , 2005, CASES '05.

[2]  Jichuan Chang,et al.  Cooperative cache partitioning for chip multiprocessors , 2007, ICS '07.

[3]  Yan Solihin,et al.  QoS policies and architecture for cache/memory in CMP platforms , 2007, SIGMETRICS '07.

[4]  Giuseppe Lipari,et al.  Elastic Scheduling for Flexible Workload Management , 2002, IEEE Trans. Computers.

[5]  Zhao Zhang,et al.  Gaining insights into multicore cache partitioning: Bridging the gap between simulation and real systems , 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.

[6]  Sangyeun Cho,et al.  Managing Distributed, Shared L2 Caches through OS-Level Page Allocation , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).

[7]  Daniel P. Siewiorek,et al.  A resource allocation model for QoS management , 1997, Proceedings Real-Time Systems Symposium.

[8]  Mike P. Papazoglou,et al.  Introduction: Service-oriented computing , 2003, CACM.

[9]  Yan Solihin,et al.  Predicting inter-thread cache contention on a chip multi-processor architecture , 2005, 11th International Symposium on High-Performance Computer Architecture.

[10]  Srihari Makineni,et al.  Communist, Utilitarian, and Capitalist cache policies on CMPs: Caches as a shared resource , 2006, 2006 International Conference on Parallel Architectures and Compilation Techniques (PACT).

[11]  James H. Anderson,et al.  Integrating Hard/Soft Real-Time Tasks and Best-Effort Jobs on Multiprocessors , 2007, 19th Euromicro Conference on Real-Time Systems (ECRTS'07).

[12]  K. M. Ahmed,et al.  LSBATCH: A Distributed Load Sharing Batch System , 1993 .

[13]  Yan Solihin,et al.  A Framework for Providing Quality of Service in Chip Multi-Processors , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).

[14]  Yannis Smaragdakis,et al.  Adaptive Caches: Effective Shaping of Cache Behavior to Workloads , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).

[15]  Fredrik Larsson,et al.  Simics: A Full System Simulation Platform , 2002, Computer.

[16]  Pradeep Dubey,et al.  Platform 2015: Intel ® Processor and Platform Evolution for the Next Decade , 2005 .

[17]  James E. Smith,et al.  Virtual private caches , 2007, ISCA '07.

[18]  S. Kim,et al.  Fair cache sharing and partitioning in a chip multiprocessor architecture , 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004..

[19]  Shuichi Oikawa,et al.  Resource kernels: a resource-centric approach to real-time and multimedia systems , 2001, Electronic Imaging.

[20]  Krste Asanovic,et al.  METERG: Measurement-Based End-to-End Performance Estimation Technique in QoS-Capable Multiprocessors , 2006, 12th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS'06).

[21]  G. Edward Suh,et al.  Analytical cache models with applications to cache partitioning , 2001, ICS '01.

[22]  Mike P. Papazoglou,et al.  Service oriented computing : Introduction , 2003 .

[23]  Philip G. Emma,et al.  Understanding some simple processor-performance limits , 1997, IBM J. Res. Dev..

[24]  Yong Luo,et al.  Development and validation of a hierarchical memory model incorporating CPU- and memory-operation overlap model , 1998, WOSP '98.

[25]  Ravi R. Iyer,et al.  CQoS: a framework for enabling QoS in shared caches of CMP platforms , 2004, ICS '04.

[26]  Won-Taek Lim,et al.  Architectural support for operating system-driven CMP cache management , 2006, 2006 International Conference on Parallel Architectures and Compilation Techniques (PACT).

[27]  Onur Mutlu,et al.  A Case for MLP-Aware Cache Replacement , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).

[28]  Yale N. Patt,et al.  Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).

[29]  Norman P. Jouppi,et al.  Enterprise IT trends and implications for architecture research , 2005, 11th International Symposium on High-Performance Computer Architecture.