Partial tag comparison: a new technology for power-efficient set-associative cache designs

We call the comparison of a small part of two tags a partial comparison. In this paper, we show that partial comparison can filter out most of the mismatching tag comparisons across different cache configurations, and that the delay of a partial comparison is only 60% of that of a full comparison. We propose using partial comparison to reduce the energy dissipated by the major components of set-associative caches. When adaptive schemes based on partial comparison are applied to the amplifiers and bit-lines, the power consumption of a set-associative cache is similar to that of a direct-mapped cache. We evaluated the proposed cache architecture with the CACTI cache model and produced the final results with the SimpleScalar CPU simulator. The power simulation results suggest that the proposed set-associative cache architecture is very power-efficient: in the simulated cache configurations, 25%-60% of cache access energy was saved.
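The filtering idea can be illustrated with a small software model. This is only a sketch under stated assumptions, not the circuit evaluated in the paper: the choice of the low-order tag bits, the `partial_bits` width, and the names `partial_match` and `lookup` are all hypothetical, and real hardware would compare the ways in parallel rather than in a loop.

```python
from typing import List, Optional, Tuple


def partial_match(tag_a: int, tag_b: int, partial_bits: int) -> bool:
    """Compare only the low-order `partial_bits` bits of two tags.

    Assumption: the low-order bits are used; the abstract does not say
    which bits the partial comparison examines.
    """
    mask = (1 << partial_bits) - 1
    return (tag_a & mask) == (tag_b & mask)


def lookup(set_tags: List[Optional[int]], request_tag: int,
           partial_bits: int) -> Tuple[Optional[int], int]:
    """Look up `request_tag` in one cache set.

    Returns (hit_way or None, number of full comparisons performed).
    Ways rejected by the partial comparison never reach the full
    comparison, which models the filtering effect.
    """
    hit_way = None
    full_compares = 0
    for way, stored in enumerate(set_tags):
        if stored is None:  # invalid (empty) way
            continue
        if not partial_match(stored, request_tag, partial_bits):
            continue  # filtered out by the partial comparison
        full_compares += 1
        if stored == request_tag:
            hit_way = way
    return hit_way, full_compares


if __name__ == "__main__":
    # Hypothetical 4-way set; only two ways share the low 4 bits
    # of the requested tag, so only two full comparisons are needed.
    tags = [0x1A3, 0x2B7, 0x3A3, 0x0F0]
    print(lookup(tags, 0x3A3, partial_bits=4))
```

In this toy set, a request whose low-order bits match none of the stored tags is rejected without a single full comparison, which is the case the adaptive schemes are meant to exploit.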
