IPC-Based Cache Partitioning: An IPC-Oriented Dynamic Shared Cache Partitioning Mechanism

In a chip multiprocessor (CMP) with a shared cache structure, the last-level cache is shared by multiple applications executing simultaneously. Competing accesses from different applications degrade system performance and make execution times unpredictable. Cache partitioning techniques divide the shared cache among the applications. Traditional cache partitioning mechanisms, such as Utility-based Cache Partitioning (UCP), aim to lower the overall miss rate of the shared cache. However, the lowest miss rate does not necessarily yield the highest performance, measured in instructions per cycle (IPC). This paper investigates IPC-based Cache Partitioning (IPC-CP), a dynamic cache partitioning method oriented toward IPC performance. We design a Miss Rate Monitor to collect miss rate information for the competing applications at run time. The collected information is then fed into a Miss-Rate-to-IPC model to obtain the corresponding IPC performance. Finally, we derive the optimal cache partition from an objective function that maximizes IPC. Our evaluation on a four-core CMP with 20 multi-programmed workloads shows that IPC-CP improves throughput over UCP by up to 53% and by 9% on average.
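The gap between a miss-oriented and an IPC-oriented objective can be illustrated with a small sketch. Everything in it is an assumption made for illustration, not the paper's mechanism: the linear CPI model in `ipc_from_miss_rate`, the per-application parameters, the invented miss-rate curves, the exhaustive search over way allocations, and the use of the sum of miss rates as a simplified stand-in for a miss-oriented objective.

```python
def ipc_from_miss_rate(miss_rate, base_cpi, accesses_per_instr, miss_penalty):
    """Hypothetical miss-rate-to-IPC model (an assumption, not the paper's):
    CPI = base CPI + (LLC accesses per instruction) * (miss penalty) * (miss rate)."""
    cpi = base_cpi + accesses_per_instr * miss_penalty * miss_rate
    return 1.0 / cpi


def compositions(total, parts):
    """Yield every split of `total` cache ways among `parts` apps, each app >= 1 way."""
    if parts == 1:
        yield (total,)
        return
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def partition_scores(alloc, apps):
    """Return (sum of miss rates, sum of IPC) for one way allocation."""
    miss = sum(app["miss_curve"][w - 1] for w, app in zip(alloc, apps))
    ipc = sum(
        ipc_from_miss_rate(app["miss_curve"][w - 1], app["base_cpi"],
                           app["accesses_per_instr"], app["miss_penalty"])
        for w, app in zip(alloc, apps)
    )
    return miss, ipc


def best_partitions(apps, total_ways):
    """Exhaustively pick the miss-minimizing and the IPC-maximizing allocation."""
    candidates = list(compositions(total_ways, len(apps)))
    min_miss = min(candidates, key=lambda a: partition_scores(a, apps)[0])
    max_ipc = max(candidates, key=lambda a: partition_scores(a, apps)[1])
    return min_miss, max_ipc


# Invented example data: miss_curve[i] is the miss rate with i + 1 ways.
APPS = [
    {"miss_curve": [0.20, 0.12, 0.08], "base_cpi": 0.5,
     "accesses_per_instr": 0.02, "miss_penalty": 200},
    {"miss_curve": [0.60, 0.45, 0.35], "base_cpi": 2.0,
     "accesses_per_instr": 0.02, "miss_penalty": 200},
]

min_miss_alloc, max_ipc_alloc = best_partitions(APPS, total_ways=4)
print("miss-oriented choice:", min_miss_alloc)
print("IPC-oriented choice: ", max_ipc_alloc)
```

With these invented numbers the two objectives disagree: the second application shows the larger drop in miss rate, so the miss-oriented choice gives it three of the four ways, while the first application has the lower base CPI and gains more IPC per avoided miss, so the IPC-oriented choice gives the three ways to it instead. A run-time mechanism would replace the fixed curves with monitored miss rates and would likely need a cheaper search than full enumeration.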
