High Performance Memory Systems

* Introduction
* Coherence, synchronization, and allocation
* Power-aware, reliable, and reconfigurable memory
* Software-based memory tuning
* Architecture-based tuning
* Workload considerations
* Index
