A way-halting cache for low-energy high-performance systems

Caches contribute to much of a microprocessor system's power and energy consumption. We have developed a new cache architecture, called a way-halting cache, that reduces energy while imposing no performance overhead. Our way-halting cache is a four-way set-associative cache that stores the four lowest-order bits of all ways' tags into a fully associative memory, which we call the halt tag array. The look-up in the hall tag array is done in parallel with, and is no slower than, the set-index decoding. The hall tag array pre-determines which tags cannot match due to their low-order four bits mismatching. Further accesses to ways with known mismatching tags are then halted, thus saving power. Our halt tag array has an additional feature of using static logic only, rather than dynamic logic used in highly associative caches. We provide data from experiments on 17 benchmarks drawn from MediaBench and Spec 2000, based on our layouts in 0.18 micron CMOS technology, On average, 55% savings of memory-access related energy were obtained over a conventional four-way set-associative cache. We show that energy savings are greater than previous methods, and nearly twice that of highly-associative caches, while imposing no performance overhead and only 2% cache area overhead.

[1]  Aristides Efthymiou,et al.  An adaptive serial-parallel CAM architecture for low-power cache blocks , 2002, ISLPED '02.

[2]  Kaushik Roy,et al.  Reducing set-associative cache energy via way-prediction and selective direct-mapping , 2001, Proceedings. 34th ACM/IEEE International Symposium on Microarchitecture. MICRO-34.

[3]  Bharadwaj Amrutur,et al.  A replica technique for wordline and sense control in low-power SRAM's , 1998, IEEE J. Solid State Circuits.

[4]  Todd M. Austin,et al.  The SimpleScalar tool set, version 2.0 , 1997, CARN.

[5]  Miodrag Potkonjak,et al.  MediaBench: a tool for evaluating and synthesizing multimedia and communications systems , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[6]  Frank Vahid,et al.  A highly configurable cache architecture for embedded systems , 2003, 30th Annual International Symposium on Computer Architecture, 2003. Proceedings..

[7]  Ikuya Kawasaki,et al.  SH3: high code density, low power , 1995, IEEE Micro.

[8]  Simon Segars Low power design techniques for microprocessors , 2000 .

[9]  Bill Moyer,et al.  A low power unified cache architecture providing power and performance flexibility , 2000, ISLPED'00: Proceedings of the 2000 International Symposium on Low Power Electronics and Design (Cat. No.00TH8514).

[10]  Dirk Grunwald,et al.  Predictive sequential associative cache , 1996, Proceedings. Second International Symposium on High-Performance Computer Architecture.

[11]  Michael Zhang,et al.  Highly-Associative Caches for Low-Power Processors , 2000 .