Tag Overflow Buffering: Reducing Total Memory Energy by Reduced-Tag Matching

We propose an energy-efficient cache architecture whose matching mechanism uses a reduced number of tag bits. The architecture moves a large subset of the tag bits out of the cache and into an external register, the Tag Overflow Buffer, which identifies the current locality of the memory references. Dynamic energy is saved because most memory references access only the reduced-tag cache; as a by-product, storing fewer tag bits also reduces leakage energy. Measured on a standard suite of embedded applications, the average savings in total (i.e., static and dynamic) cache energy range from 16% to 40%, depending on the cache's structural parameters.
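
The reduced-tag matching idea can be illustrated with a toy model. The sketch below is not the authors' design: the class name `ReducedTagCache`, the direct-mapped organization, the bit widths, and above all the policy on a Tag Overflow Buffer mismatch (invalidate every line and reload the buffer) are assumptions made only to keep the example small. The abstract does not specify how a change of locality is handled.

```python
class ReducedTagCache:
    """Toy direct-mapped cache that stores only the low tag bits per line.

    Hypothetical illustration; bit widths and the mismatch policy are assumed.
    """

    def __init__(self, index_bits=4, offset_bits=4, low_tag_bits=4):
        self.index_bits = index_bits
        self.offset_bits = offset_bits
        self.low_tag_bits = low_tag_bits
        # Each entry holds the stored low tag bits, or None if the line is invalid.
        self.lines = [None] * (1 << index_bits)
        # Tag Overflow Buffer: the upper tag bits shared by all cached lines.
        self.tob = None

    def split(self, addr):
        """Split an address into (upper tag bits, low tag bits, index)."""
        index = (addr >> self.offset_bits) & ((1 << self.index_bits) - 1)
        tag = addr >> (self.offset_bits + self.index_bits)
        low_tag = tag & ((1 << self.low_tag_bits) - 1)
        high_tag = tag >> self.low_tag_bits
        return high_tag, low_tag, index

    def access(self, addr):
        """Return 'hit', 'miss', or 'overflow' for one memory reference."""
        high_tag, low_tag, index = self.split(addr)
        if high_tag != self.tob:
            # Assumed policy: the locality changed, so invalidate all lines,
            # reload the Tag Overflow Buffer, and fill the requested line.
            self.lines = [None] * len(self.lines)
            self.tob = high_tag
            self.lines[index] = low_tag
            return "overflow"
        # Common case: only the reduced (low) tag bits are compared.
        if self.lines[index] == low_tag:
            return "hit"
        self.lines[index] = low_tag
        return "miss"


cache = ReducedTagCache()
trace = [0x1000, 0x1004, 0x1100, 0x1100, 0x2000, 0x2000]
results = [cache.access(a) for a in trace]
print(results)
```

In this model, references that stay within one locality (the same upper tag bits) are resolved by comparing only `low_tag_bits` bits per line, which is where the reduced-tag cache would save energy; the full-width comparison is replaced by a single check against the shared buffer.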
