A Micro-Architectural Power-Saving Technique for D-NUCA Caches

D-NUCA (Dynamic Non-Uniform Cache Architecture) caches are cache memories that, thanks to their banked organization, broadcast search, and promotion/demotion mechanism, can tolerate the increasing wire-delay effects introduced by technology scaling. As a consequence, they are expected to outperform conventional caches (UCA, Uniform Cache Architectures) in future-generation cores. Because of the promotion/demotion mechanism, we observed that the distribution of hits across the ways of a D-NUCA cache varies across applications as well as across different execution phases within a single application. In this work, we show how this behavior can be leveraged to improve D-NUCA power efficiency and to decrease its access latency. In particular, we propose: 1) a new microarchitectural technique that reduces the static power consumption of a D-NUCA cache by dynamically adapting the number of active (i.e., powered-on) ways to the needs of the running application; our initial evaluation shows that a 36.9% reduction in the average number of active ways is achievable without significantly affecting the IPC (-2.97%), resulting in a 30.9% reduction of the Energy Delay Product (EDP); 2) a strategy to tune the characteristic parameters of the proposed technique; and 3) a variant of the technique that leads to a more aggressive power-reduction strategy.
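To make the idea of adapting the number of active ways more concrete, the following Python sketch shows one possible interval-based controller. It is not the algorithm evaluated in this work, whose details are not given in the abstract. The class name `WayAdaptiveController`, the window length `interval`, the thresholds `low` and `high`, and the rule of inspecting only the farthest active way are all assumptions made for illustration. Way 0 is assumed to be the way closest to the cache controller, so that promotion concentrates frequently used lines in low-numbered ways.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WayAdaptiveController:
    """Hypothetical controller that resizes the set of powered-on ways."""

    total_ways: int = 8
    min_ways: int = 1
    interval: int = 1000   # hits per decision window (assumed value)
    low: float = 0.02      # shrink below this share of hits (assumed value)
    high: float = 0.10     # grow above this share of hits (assumed value)
    active_ways: int = field(init=False)
    hits: List[int] = field(init=False)

    def __post_init__(self) -> None:
        # Start with every way powered on and all counters cleared.
        self.active_ways = self.total_ways
        self.hits = [0] * self.total_ways

    def record_hit(self, way: int) -> None:
        """Count a hit in `way` and re-evaluate once the window is full."""
        if not 0 <= way < self.active_ways:
            raise ValueError("hit reported in a powered-off way")
        self.hits[way] += 1
        if sum(self.hits) >= self.interval:
            self._adapt()

    def _adapt(self) -> None:
        # Share of this window's hits that landed in the farthest active way.
        far_share = self.hits[self.active_ways - 1] / sum(self.hits)
        if far_share < self.low and self.active_ways > self.min_ways:
            self.active_ways -= 1   # farthest way is barely used: power it off
        elif far_share > self.high and self.active_ways < self.total_ways:
            self.active_ways += 1   # farthest way is busy: power one more on
        self.hits = [0] * self.total_ways  # start a new window
```

A simulator would call `record_hit(way)` on every cache hit and read `active_ways` to decide which ways to keep powered. The sketch deliberately omits what a hardware design would also have to address, such as handling the contents of a way that is being switched off and accounting for misses when deciding to grow.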
