Breaking Address Mapping Symmetry at Multi-levels of Memory Hierarchy to Reduce DRAM Row-buffer Conflicts
[1] James E. Smith,et al. Performance of Cached DRAM Organizations in Vector Supercomputers , 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.
[2] C.-L. Chen,et al. Analysis of vector access performance on skewed interleaved memory , 1989, ISCA '89.
[3] Douglas W. Clark,et al. A Characterization of Processor Performance in the VAX-11/780 , 1984, ISCA '84.
[4] Trevor N. Mudge,et al. A performance comparison of contemporary DRAM architectures , 1999, ISCA.
[5] John L. Henning. SPEC CPU2000: Measuring CPU Performance in the New Millennium , 2000, Computer.
[6] Kevin Skadron,et al. Design issues and tradeoffs for write buffers , 1997, Proceedings Third International Symposium on High-Performance Computer Architecture.
[7] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, The 17th Annual International Symposium on Computer Architecture.
[8] Gurindar S. Sohi. High-Bandwidth Interleaved Memories for Vector Processors-A Simulation Study , 1993, IEEE Trans. Computers.
[9] André Seznec,et al. A case for two-way skewed-associative caches , 1993, ISCA '93.
[10] Teruo Tanaka,et al. Scalable parallel memory architecture with a skew scheme , 1993, ICS '93.
[11] Q. S. Gao. The Chinese remainder theorem and the prime memory system , 1993, ISCA '93.
[12] André Seznec,et al. Interleaved parallel schemes: improving memory throughput on supercomputers , 1992, ISCA '92.
[13] William J. Dally,et al. Memory access scheduling , 2000, Proceedings of 27th International Symposium on Computer Architecture.
[14] Zhao Zhang,et al. A permutation-based page interleaving scheme to reduce row-buffer conflicts and exploit data locality , 2000, MICRO 33.
[15] A. Gonzalez,et al. Cache sensitive modulo scheduling , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[16] John H. Zurawski,et al. The Design and Verification of the AlphaStation 600 5-series Workstation , 1995, Digit. Tech. J.
[17] Todd M. Austin,et al. The SimpleScalar tool set, version 2.0 , 1997, CARN.
[18] Xiaobo Li,et al. XOR Storage Schemes for Frequently Used Data Patterns , 1995, J. Parallel Distributed Comput.
[19] Eduard Ayguadé,et al. Conflict-free access of vectors with power-of-two strides , 1992, ICS '92.
[20] Mateo Valero,et al. Eliminating cache conflict misses through XOR-based placement functions , 1997, ICS '97.
[21] D. Burger,et al. Memory Bandwidth Limitations of Future Microprocessors , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[22] B. Ramakrishna Rau,et al. The Cydra 5 Stride-Insensitive Memory System , 1989, ICPP.
[23] Wei-Fen Lin,et al. Reducing DRAM latencies with an integrated memory hierarchy design , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.
[24] F. Jesús Sánchez,et al. Cache Sensitive Modulo Scheduling , 1997, MICRO.
[25] B. Ramakrishna Rau,et al. Pseudo-randomly interleaved memory , 1991, ISCA '91.
[26] David T. Harper,et al. Performance Evaluation of Vector Accesses in Parallel Memories Using a Skewed Storage Scheme , 1986, ISCA.