Design and optimization of large size and low overhead off-chip caches
[1] James E. Smith,et al. Performance Of Cached Dram Organizations In Vector Supercomputers , 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.
[2] Christoforos E. Kozyrakis,et al. A case for intelligent RAM , 1997, IEEE Micro.
[3] Christoforos Kozyrakis,et al. A Media-Enhanced Vector Architecture for Embedded Memory Systems , 1999 .
[4] Jignesh M. Patel,et al. Data prefetching by dependence graph precomputation , 2001, ISCA 2001.
[5] Gary S. Tyson,et al. A modified approach to data cache management , 1995, Proceedings of the 28th Annual International Symposium on Microarchitecture.
[6] James R. Goodman,et al. Instruction Cache Replacement Policies and Organizations , 1985, IEEE Transactions on Computers.
[7] Michael Shantz,et al. Multi-level texture caching for 3D graphics hardware , 1998, Proceedings. 25th Annual International Symposium on Computer Architecture.
[8] Chun Chen,et al. The architecture of the DIVA processing-in-memory chip , 2002, ICS '02.
[9] Zhao Zhang,et al. Fine-grain priority scheduling on multi-channel memory systems , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.
[10] Doug Burger. System-Level Implications of Processor-Memory Integration , 1997 .
[11] Zarka Cvetanovic,et al. AlphaServer 4100 Performance Characterization , 1996, Digit. Tech. J..
[12] Anoop Gupta,et al. The Design and Analysis of a Cache Architecture for Texture Mapping , 1997, ISCA.
[13] Lance Hammond,et al. A Single Chip Multiprocessor Integrated with High Density DRAM , 1997 .
[14] Michael E. Wazlowski,et al. Pinnacle: IBM MXT in a Memory Controller Chip , 2001, IEEE Micro.
[15] Kenneth M. Wilson,et al. Designing High Bandwidth On-chip Caches , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[16] Graham Kirsch. Active memory: Micron's Yukon , 2003, Proceedings International Parallel and Distributed Processing Symposium.
[17] R. Norwood,et al. Memory-a new era of fast dynamic RAMs (for video applications) , 1992, IEEE Spectrum.
[18] Richard E. Kessler,et al. Evaluating stream buffers as a secondary cache replacement , 1994, Proceedings of 21 International Symposium on Computer Architecture.
[19] Xiaowei Shen,et al. Performance of hardware compressed main memory , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.
[20] Zhao Zhang,et al. A permutation-based page interleaving scheme to reduce row-buffer conflicts and exploit data locality , 2000, MICRO 33.
[21] A. Seznec,et al. Decoupled sectored caches: conciliating low tag implementation cost and low miss ratio , 1994, Proceedings of 21 International Symposium on Computer Architecture.
[22] Wei-Fen Lin,et al. Reducing DRAM latencies with an integrated memory hierarchy design , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.
[23] André Seznec,et al. Decoupled sectored caches: conciliating low tag implementation cost , 1994, ISCA '94.
[24] Norman P. Jouppi,et al. Cacti 3.0: an integrated cache timing, power, and area model , 2001 .
[25] Wen-Hann Wang,et al. On the inclusion properties for multi-level cache hierarchies , 1988, ISCA '88.
[26] Qing Yang,et al. DCD --- Disk Caching Disk: A New Approach for Boosting I/O Performance , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[27] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[28] Wen-Hann Wang,et al. On the Inclusion Properties for Multi-Level Cache Hierarchies , 1988, ISCA.
[29] Nancy Warter-Perez,et al. Modulo scheduling with multiple initiation intervals , 1995, MICRO 1995.
[30] Stephen Richardson,et al. An Equal Area Comparison of Embedded DRAM and SRAM Memory Architectures for a Chip Multiprocessor , 2000 .
[31] Zhao Zhang,et al. Cached DRAM for ILP Processor Memory Access Latency Reduction , 2001, IEEE Micro.
[32] Jean-Loup Baer,et al. Modified LRU policies for improving second-level cache behavior , 2000, Proceedings Sixth International Symposium on High-Performance Computer Architecture. HPCA-6.
[33] Hideto Hidaka,et al. The cache DRAM architecture: a DRAM with an on-chip cache memory , 1990, IEEE Micro.
[34] Rajeev Balasubramonian,et al. Dynamically allocating processor resources between nearby and distant ILP , 2001, ISCA 2001.
[35] James R. Goodman,et al. A study of instruction cache organizations and replacement policies , 1983, ISCA '83.
[36] Fong Pong,et al. Missing the Memory Wall: The Case for Processor/Memory Integration , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[37] Chi-Keung Luk,et al. Tolerating memory latency through software-controlled pre-execution in simultaneous multithreading processors , 2001, Proceedings 28th Annual International Symposium on Computer Architecture.
[38] Trevor N. Mudge,et al. A performance comparison of contemporary DRAM architectures , 1999, ISCA.
[39] Gary S. Tyson,et al. Eager writeback-a technique for improving bandwidth utilization , 2000, Proceedings 33rd Annual IEEE/ACM International Symposium on Microarchitecture. MICRO-33 2000.
[40] Stéphan Jourdan,et al. Speculation techniques for improving load related instruction scheduling , 1999, ISCA.
[41] Leonid Oliker,et al. Memory-intensive benchmarks: IRAM vs. cache-based machines , 2002, Proceedings 16th International Parallel and Distributed Processing Symposium.
[42] Yiming Hu,et al. DCD—disk caching disk: a new approach for boosting I/O performance , 1996, ISCA '96.
[43] Alan Jay Smith,et al. Functional Implementation Techniques for CPU Cache Memories , 1999, IEEE Trans. Computers.
[44] Anoop Gupta,et al. Parallel computer architecture - a hardware / software approach , 1998 .
[45] Charles A. Hart. CDRAM in a unified memory architecture , 1994, Proceedings of COMPCON '94.
[46] Brad Calder,et al. A Decoupled Predictor-Directed Stream Prefetching Architecture , 2003, IEEE Trans. Computers.
[47] Gershon Kedem,et al. WCDRAM: A fully associative integrated Cached-DRAM with wide cache lines , 1997 .
[48] Naveen Cherukuri,et al. The IA-64 Itanium Processor Cartridge , 2001, IEEE Micro.