Dynamically variable line-size cache exploiting high on-chip memory bandwidth of merged DRAM/logic LSIs

This paper proposes a novel cache architecture for merged DRAM/logic LSIs, called the dynamically variable line-size cache (D-VLS cache). The D-VLS cache adjusts its line size dynamically to match the characteristics of the running program, and aims to improve performance by exploiting the high on-chip memory bandwidth. In our evaluation, a direct-mapped D-VLS cache improves performance by about 27% over a conventional direct-mapped cache with fixed 32-byte lines.
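
To illustrate why the best line size depends on program behavior, the following is a minimal sketch of a toy direct-mapped cache model with a fixed line size, run on two synthetic access patterns. It is not the D-VLS mechanism itself, which the abstract does not detail. The function `miss_rate`, the 1 KiB cache size, and both address traces are assumptions chosen purely for illustration.

```python
def miss_rate(addresses, cache_bytes, line_bytes):
    """Miss rate of a toy direct-mapped cache with a fixed line size.

    Hypothetical helper for illustration only; it models tags and nothing else.
    """
    num_lines = cache_bytes // line_bytes
    tags = [None] * num_lines          # one stored block number per cache line
    misses = 0
    for addr in addresses:
        block = addr // line_bytes     # memory block containing this address
        index = block % num_lines      # direct-mapped placement
        if tags[index] != block:       # miss: fetch the block, evict the old one
            tags[index] = block
            misses += 1
    return misses / len(addresses)


CACHE_BYTES = 1024  # assumed toy cache size

# Pattern 1: sequential 4-byte accesses (high spatial locality).
sequential = list(range(0, 64 * 1024, 4))

# Pattern 2: two addresses that alternate. With 32-byte lines they map to
# different cache lines; with 128-byte lines they collide on the same line.
sparse = [0, 1056] * 512

for line_bytes in (32, 128):
    print(line_bytes,
          miss_rate(sequential, CACHE_BYTES, line_bytes),
          miss_rate(sparse, CACHE_BYTES, line_bytes))
```

In this toy setting the sequential trace misses less often with 128-byte lines, while the alternating trace misses less often with 32-byte lines. No single fixed line size suits both traces, which is the kind of trade-off a variable line size is meant to address.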
