Design space exploration of workload-specific last-level caches

Leakage power of last-level caches constitute a significant part of overall power consumption. Various circuit-level and technology-based methods have been proposed to reduce cache leakage. However, from a system designer's perspective, for a particular configuration and workload, it is not clear which method will be most effective. In this work, we make a detailed evaluation and comparison of cache energy reduction techniques. Our results show that when energy is very scarce and important, the best results are obtained with highly energy efficient Tunnel-FET caches. When the available energy increases and performance becomes a bigger concern, there is no single winner. While a small number of capacity sensitive workloads benefit from increased capacity of STT-RAM caches, latency sensitive workloads prefer solutions with smaller latency penalties such as drowsy caches.

[1]  Jan M. Rabaey,et al.  SRAM leakage suppression by minimizing standby supply voltage , 2004, International Symposium on Signals, Circuits and Systems. Proceedings, SCS 2003. (Cat. No.03EX720).

[2]  David Blaauw,et al.  Circuit and microarchitectural techniques for reducing cache leakage power , 2004, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[3]  Shin'ichiro Mutoh,et al.  1-V power supply high-speed digital circuit technology with multithreshold-voltage CMOS , 1995, IEEE J. Solid State Circuits.

[4]  Trevor Mudge,et al.  MiBench: A free, commercially representative embedded benchmark suite , 2001 .

[5]  M. Hosomi,et al.  A novel nonvolatile memory with spin torque transfer magnetization switching: spin-ram , 2005, IEEE InternationalElectron Devices Meeting, 2005. IEDM Technical Digest..

[6]  John L. Henning SPEC CPU2006 benchmark descriptions , 2006, CARN.

[7]  Chita R. Das,et al.  Architecting on-chip interconnects for stacked 3D STT-RAM caches in CMPs , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).

[8]  Tao Zhang,et al.  MorphCache: A Reconfigurable Adaptive Multi-level Cache hierarchy , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.

[9]  T. Mayer,et al.  Experimental demonstration of 100nm channel length In0.53Ga0.47As-based vertical inter-band tunnel field effect transistors (TFETs) for ultra low-power logic and SRAM applications , 2009, 2009 IEEE International Electron Devices Meeting (IEDM).

[10]  Cong Xu,et al.  NVSim: A Circuit-Level Performance, Energy, and Area Model for Emerging Nonvolatile Memory , 2012, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[11]  Mahmut T. Kandemir,et al.  Improving energy efficiency of multi-threaded applications using heterogeneous CMOS-TFET multicores , 2011, IEEE/ACM International Symposium on Low Power Electronics and Design.

[12]  Narayanan Vijaykrishnan,et al.  Variation-tolerant ultra low-power heterojunction tunnel FET SRAM design , 2011, 2011 IEEE/ACM International Symposium on Nanoscale Architectures.

[13]  David Blaauw,et al.  A Sub-200mV 6T SRAM in 0.13μm CMOS , 2007, 2007 IEEE International Solid-State Circuits Conference. Digest of Technical Papers.

[14]  S. Ikeda,et al.  2 Mb SPRAM (SPin-Transfer Torque RAM) With Bit-by-Bit Bi-Directional Current Write and Parallelizing-Direction Current Read , 2008, IEEE Journal of Solid-State Circuits.

[15]  Norman P. Jouppi,et al.  Architecting Efficient Interconnects for Large Caches with CACTI 6.0 , 2008, IEEE Micro.

[16]  Kaushik Roy,et al.  A 160 mV, fully differential, robust schmitt trigger based sub-threshold SRAM , 2007, Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07).

[17]  Yiran Chen,et al.  VOSCH: Voltage scaled cache hierarchies , 2007, 2007 25th International Conference on Computer Design.

[18]  Shoji Ikeda,et al.  2Mb Spin-Transfer Torque RAM (SPRAM) with Bit-by-Bit Bidirectional Current Write and Parallelizing-Direction Current Read , 2007, 2007 IEEE International Solid-State Circuits Conference. Digest of Technical Papers.

[19]  David Blaauw,et al.  Drowsy caches: simple techniques for reducing leakage power , 2002, ISCA.

[20]  William H. Mangione-Smith,et al.  The filter cache: an energy efficient memory structure , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.