Trends in High-Performance, Low-Power Cache Memory Architectures (Special Issue on High-Performance and Low-Power Microprocessors)

Energy efficiency is an uncompromising requirement of portable computing because it directly affects battery life. At the same time, portable computing will target more demanding applications, such as moving pictures, so higher performance is still required. Cache memories are employed as one of the most important components of computer systems. In this paper, we briefly survey architectural techniques for high-performance, low-power cache memories.

key words: cache, low power, high performance, microprocessor, survey
