Reducing energy and delay using efficient victim caches

In this paper, we investigate methods for improving the hit rates in the first level of memory hierarchy. Particularly, we propose victim cache structures to reduce the number of accesses to more power consuming structures such as level 2 caches. We compare the proposed victim cache techniques to increasing the associativity or the size of the level 1 data cache and show that the enhanced victim cache technique yield better energy-delay and energy-delay-area products. We also propose techniques that predict the hit/miss behavior of the victim cache accesses and bypass the victim cache when a miss can be determined quickly. We report simulation results obtained from SimpleScalar/ARM modeling a representative Network Processor architecture. The simulations show that the victim cache is able to reduce the energy consumption by as much as 17.6% (8.6% on average) while reducing the execution time by as much as 8.4% (3.7% on average) for a set of representative applications.

[1]  D. B. Davis,et al.  Intel Corp. , 1993 .

[2]  Nikil D. Dutt,et al.  Reducing address bus transition for low power memory mapping , 1996, Proceedings ED&TC European Design and Test Conference.

[3]  Norman P. Jouppi,et al.  Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.

[4]  David H. Albonesi,et al.  Selective cache ways: on-demand cache resource allocation , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.

[5]  Mary Jane Irwin,et al.  Energy-delay analysis for on-chip interconnect at the system level , 1999, Proceedings. IEEE Computer Society Workshop on VLSI '99. System Design: Towards System-on-a-Chip Paradigm.

[6]  Richard T. Witek,et al.  A 160 MHz 32 b 0.5 W CMOS RISC microprocessor , 1996, 1996 IEEE International Solid-State Circuits Conference. Digest of TEchnical Papers, ISSCC.

[7]  Wendong Hu,et al.  NetBench: a benchmarking suite for network processors , 2001, IEEE/ACM International Conference on Computer Aided Design. ICCAD 2001. IEEE/ACM Digest of Technical Papers (Cat. No.01CH37281).

[8]  Ibrahim N. Hajj,et al.  Architectural and compiler support for energy reduction in the memory hierarchy of high performance microprocessors , 1998, Proceedings. 1998 International Symposium on Low Power Electronics and Design (IEEE Cat. No.98TH8379).

[9]  William H. Mangione-Smith,et al.  The filter cache: an energy efficient memory structure , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[10]  Norman P. Jouppi,et al.  Improving direct-mapped cache performance by the addition of a small fully-associative cache and pre , 1990, ISCA 1990.

[11]  Srilatha Manne,et al.  Power and performance tradeoffs using various caching strategies , 1998, Proceedings. 1998 International Symposium on Low Power Electronics and Design (IEEE Cat. No.98TH8379).

[12]  Glenn Reinman,et al.  Just say no: benefits of early cache miss determination , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings..

[13]  Dirk Grunwald,et al.  A comparison of software code reordering and victim buffers , 1999, CARN.

[14]  David A. Patterson,et al.  Computer Architecture: A Quantitative Approach , 1969 .