Adopting TLB index-based tagging to data caches for tag energy reduction

Conventional caches identify requested data by matching address-based tags. This scheme is inefficient because it uses more tag bits than necessary. Previous studies show that TLB index-based tagging (TLBIT) can be used in caches instead: because of spatial locality, only a small number of distinct tags are live at any given time, and those tags are already captured by the TLB. In this paper, we show that adopting TLBIT directly is not effective for data caches, because the cache line searches and invalidations it requires incur large performance and energy overheads. To realize the true potential of TLBIT, we propose three novel techniques: search zone, c-LRU, and TLB buffer. Search zone reduces unnecessary cache line searches, c-LRU reduces cache line invalidations, and TLB buffer prevents immediate cache line invalidations on TLB misses. In our experiments, the proposed techniques reduce the overall dynamic energy consumption of the data cache by 8.9% on average, with a small performance impact of less than 0.2% on average.
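To make the basic TLBIT idea concrete, the following is a minimal toy model in Python, not the design evaluated in the paper. It assumes a direct-mapped cache whose size equals one page (so the address tag is exactly the page number), FIFO TLB replacement, and 4 KiB pages with 32-byte lines. The names `ToyTLB` and `ToyTLBITCache` are hypothetical. The sketch only illustrates why a TLB entry replacement forces a search for, and invalidation of, cache lines tagged with that entry's index; it does not model search zone, c-LRU, or TLB buffer, whose mechanisms are not described here.

```python
PAGE_BITS = 12  # assumed 4 KiB pages
LINE_BITS = 5   # assumed 32-byte cache lines


class ToyTLB:
    """Fully associative TLB with FIFO replacement (an assumption)."""

    def __init__(self, entries=32):
        self.vpns = [None] * entries  # virtual page number held by each entry
        self.next_victim = 0

    def lookup(self, vpn):
        """Return (entry_index, evicted_index or None)."""
        if vpn in self.vpns:
            return self.vpns.index(vpn), None
        idx = self.next_victim
        # Only report an eviction if the slot held a valid translation.
        evicted = idx if self.vpns[idx] is not None else None
        self.vpns[idx] = vpn
        self.next_victim = (idx + 1) % len(self.vpns)
        return idx, evicted


class ToyTLBITCache:
    """Direct-mapped cache that stores a TLB entry index as the tag."""

    def __init__(self, tlb):
        self.tlb = tlb
        self.num_sets = 1 << (PAGE_BITS - LINE_BITS)
        self.tags = [None] * self.num_sets  # TLB index per line, None = invalid
        self.invalidations = 0

    def access(self, vaddr):
        """Return True on a cache hit, False on a miss (line is then filled)."""
        vpn = vaddr >> PAGE_BITS
        set_idx = (vaddr >> LINE_BITS) & (self.num_sets - 1)
        tlb_idx, evicted = self.tlb.lookup(vpn)
        if evicted is not None:
            # The evicted index now names a different page, so every line
            # still tagged with it must be found and invalidated first.
            self._invalidate(evicted)
        hit = self.tags[set_idx] == tlb_idx
        if not hit:
            self.tags[set_idx] = tlb_idx
        return hit

    def _invalidate(self, tlb_idx):
        # Naive full search of the tag array: the overhead the abstract
        # attributes to directly adopting TLBIT.
        for s, tag in enumerate(self.tags):
            if tag == tlb_idx:
                self.tags[s] = None
                self.invalidations += 1
```

In this toy configuration a 32-entry TLB needs only a 5-bit tag per line, compared with a 20-bit page number for a 32-bit address. The price is visible in `_invalidate`: each TLB replacement triggers a scan of the tag array and may discard lines that would otherwise have hit.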
