Adaptive mode control: a static-power-efficient cache design

Lower threshold voltages in deep sub-micron technologies cause store leakage current, increasing static power dissipation. This trend, combined with the trend of larger/more cache memories dominating die area, has prompted circuit designers to develop SRAM cells with low-leakage operating modes (e.g., sleep mode). Sleep mode reduces static power dissipation but data stored in a sleeping cell is unreliable or lost. So, at the architecture level, there is interest in exploiting sleep mode to reduce static power dissipation while maintaining high performance. Current approaches dynamically control the operating mode of large groups of cache lines or even individual cache lines. However, the performance monitoring mechanism that controls the percentage of sleep-mode lines, and identifies particular lines for sleep mode, is somewhat arbitrary. There is no way to know what the performance could be with all cache lines active, so arbitrary miss rate targets are set (perhaps on a per-benchmark basis using profile information) and the control mechanism tracks these targets. We propose applying sleep mode only to the data store and not the tag store. By keeping the entire tag store active, the hardware knows what the hypothetical miss rate would be if all data lines were active and the actual miss rate can be made to precisely track it. Simulations show an average of 73% of I-cache lines and 54% of D-cache lines are put in sleep mode with an average IPC impact of only 1.7%, for 64KB caches.

[1]  Mark Horowitz,et al.  Energy dissipation in general purpose microprocessors , 1996, IEEE J. Solid State Circuits.

[2]  Ibrahim N. Hajj,et al.  Architectural and compiler techniques for energy reduction in high-performance microprocessors , 2000, IEEE Trans. Very Large Scale Integr. Syst..

[3]  Mahmut T. Kandemir,et al.  Energy-driven integrated hardware-software optimizations using SimplePower , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).

[4]  Vivek De,et al.  A new technique for standby leakage reduction in high-performance circuits , 1998, 1998 Symposium on VLSI Circuits. Digest of Technical Papers (Cat. No.98CH36215).

[5]  Kaushik Roy,et al.  An integrated circuit/architecture approach to reducing leakage in deep-submicron high-performance I-caches , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.

[6]  T. Fujita,et al.  A 0.9 V 150 MHz 10 mW 4 mm/sup 2/ 2-D discrete cosine transform core processor with variable-threshold-voltage scheme , 1996, 1996 IEEE International Solid-State Circuits Conference. Digest of TEchnical Papers, ISSCC.

[7]  William H. Mangione-Smith,et al.  The filter cache: an energy efficient memory structure , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[8]  Norman P. Jouppi,et al.  An Integrated Cache Timing and Power Model , 2002 .

[9]  Kanad Ghose,et al.  Analytical energy dissipation models for low-power caches , 1997, ISLPED '97.

[10]  Koji Nii,et al.  A low power SRAM using auto-backgate-controlled MT-CMOS , 1998, Proceedings. 1998 International Symposium on Low Power Electronics and Design (IEEE Cat. No.98TH8379).

[11]  David H. Albonesi,et al.  Selective cache ways: on-demand cache resource allocation , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.

[12]  U. Ko,et al.  Characterization and design of a low-power, high-performance cache architecture , 1995, 1995 International Symposium on VLSI Technology, Systems, and Applications. Proceedings of Technical Papers.

[13]  G. Tyson,et al.  Eager writeback-a technique for improving bandwidth utilization , 2000, Proceedings 33rd Annual IEEE/ACM International Symposium on Microarchitecture. MICRO-33 2000.

[14]  Satoshi Shigematsu,et al.  A 1-V high-speed MTCMOS circuit scheme for power-down application circuits , 1997, IEEE J. Solid State Circuits.

[15]  T. Kuroda A 0.9V 150MHz 10mW 4mm^2 2-D Discrete Cosine Transform Core Processor with Variable-Threshold-Voltage Scheme , 1996 .

[16]  Shekhar Y. Borkar,et al.  Design challenges of technology scaling , 1999, IEEE Micro.

[17]  Martin Margala Low-power SRAM circuit design , 1999, Records of the 1999 IEEE International Workshop on Memory Technology, Design and Testing.

[18]  Margaret Martonosi,et al.  Cache decay: exploiting generational behavior to reduce cache leakage power , 2001, ISCA 2001.

[19]  Todd M. Austin,et al.  The SimpleScalar tool set, version 2.0 , 1997, CARN.

[20]  Richard T. Witek,et al.  A 160 MHz 32 b 0.5 W CMOS RISC microprocessor , 1996, 1996 IEEE International Solid-State Circuits Conference. Digest of TEchnical Papers, ISSCC.

[21]  Kaushik Roy,et al.  Gated-Vdd: a circuit technique to reduce leakage in deep-submicron cache memories , 2000, ISLPED '00.

[22]  Stefanos Kaxiras,et al.  Cache-Line Decay: A Mechanism to Reduce Cache Leakage Power , 2000, PACS.