Selective cache ways: on-demand cache resource allocation

Rising microprocessor power dissipation calls for new architectural approaches that save energy by better matching on-chip resources to application requirements. Selective cache ways provides the ability to disable a subset of the ways in a set-associative cache during periods of modest cache activity, while the full cache remains available for more cache-intensive periods. Because this approach leverages the subarray partitioning that is already present for performance reasons, only minor changes to a conventional cache are required, and full-speed cache operation can therefore be maintained. Furthermore, the tradeoff between performance and energy is flexible and can be dynamically tailored to changing application and machine environmental conditions. We show that, with this approach, accepting a small performance degradation can yield a significant reduction in cache energy dissipation.
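The idea of enabling only some ways can be illustrated with a toy simulator. The following is a minimal sketch under stated assumptions, not the hardware design or energy model evaluated in the paper: the class `SelectiveWayCache`, the method `set_enabled_ways` (a hypothetical stand-in for a software-visible way-select register), LRU replacement, the choice to invalidate blocks in ways that are turned off, and the `way_reads` counter (a crude energy proxy that charges one unit per enabled way probed on each access) are all introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SelectiveWayCache:
    """Toy set-associative cache whose ways can be enabled or disabled."""
    num_sets: int = 4
    num_ways: int = 4
    block_bytes: int = 32
    enabled_ways: int = 4   # hypothetical way-select setting (assumption)
    hits: int = 0
    misses: int = 0
    way_reads: int = 0      # crude energy proxy (assumption): ways probed in total
    sets: list = field(default_factory=list, init=False)

    def __post_init__(self):
        # Each set holds its resident tags, ordered least to most recently used.
        self.sets = [[] for _ in range(self.num_sets)]

    def set_enabled_ways(self, n):
        """Enable n ways; blocks that no longer fit are invalidated (assumption)."""
        if not 1 <= n <= self.num_ways:
            raise ValueError("enabled ways must be between 1 and num_ways")
        self.enabled_ways = n
        for tags in self.sets:
            # Drop the least recently used tags beyond the new capacity.
            del tags[:max(0, len(tags) - n)]

    def access(self, addr):
        """Look up a byte address; return True on a hit, False on a miss."""
        block = addr // self.block_bytes
        index = block % self.num_sets
        tag = block // self.num_sets
        tags = self.sets[index]
        # Only the enabled ways are probed, so only they are charged.
        self.way_reads += self.enabled_ways
        if tag in tags:
            tags.remove(tag)
            tags.append(tag)          # mark as most recently used
            self.hits += 1
            return True
        if len(tags) >= self.enabled_ways:
            tags.pop(0)               # evict the least recently used block
        tags.append(tag)
        self.misses += 1
        return False


def run(trace, ways):
    """Replay a list of addresses with the given number of enabled ways."""
    cache = SelectiveWayCache()
    cache.set_enabled_ways(ways)
    for addr in trace:
        cache.access(addr)
    return cache.hits, cache.misses, cache.way_reads
```

In this sketch, a trace that alternates between two conflicting blocks, such as `run([0, 128] * 10, ways=2)`, sees the same hits and misses as with four ways while probing half as many ways. A trace that cycles through four conflicting blocks, such as `[0, 128, 256, 384] * 5`, hits after warm-up with four ways but misses on every access with two. This mirrors the tradeoff the abstract describes: ways can be disabled at little cost during modest cache activity, and re-enabled for cache-intensive periods.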
