Scalable shared-memory architectures. Introduction to the minitrack

The single address-space that shared-memory architectures offer simplifies programming, problem partitioning, and dynamic load balancing as compared to other programming models for parallel computing systems such as e.g. Message passing. Unfortunately, as we scale shared-memory architectures to large configurations, the resulting memory system latencies may limit their performance potentials. Finding cost-effective solutions to the memory-system latency issue has become an important research objective and is the main focus of this minitrack. Loosely speaking, scalability for these systems refers to optimizing both the performance and the implementation cost. While it is not meaningful to strictly define the term scalability, it is an important intuitive goal when evaluating new shared-memory architectures. >

[1]  Per Stenström,et al.  A Survey of Cache Coherence Schemes for Multiprocessors , 1990, Computer.

[2]  Josep Torrellas,et al.  False Sharing ans Spatial Locality in Multiprocessor Caches , 1994, IEEE Trans. Computers.

[3]  Jean-Loup Baer,et al.  A performance study of software and hardware data prefetching schemes , 1994, ISCA '94.

[4]  Anoop Gupta,et al.  Design and evaluation of a compiler algorithm for prefetching , 1992, ASPLOS V.

[5]  Anant Agarwal,et al.  Directory-based cache coherence in large-scale multiprocessors , 1990, Computer.

[6]  Josep Torrellas,et al.  Evaluating the Performance of Cache-Affinity Scheduling in Shared-Memory Multiprocessors , 1995, J. Parallel Distributed Comput..

[7]  Anant Agarwal,et al.  Performance Tradeoffs in Multithreaded Processors , 1992, IEEE Trans. Parallel Distributed Syst..

[8]  Anoop Gupta,et al.  Comparative performance evaluation of cache-coherent NUMA and COMA architectures , 1992, ISCA '92.

[9]  Anoop Gupta,et al.  Comparative evaluation of latency reducing and tolerating techniques , 1991, ISCA '91.

[10]  Mary K. Vernon,et al.  Efficient synchronization primitives for large-scale cache-coherent multiprocessors , 1989, ASPLOS 1989.

[11]  Michael L. Scott,et al.  Simple but effective techniques for NUMA memory management , 1989, SOSP '89.

[12]  M. Hill,et al.  Weak ordering-a new definition , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.

[13]  Anoop Gupta,et al.  Memory consistency and event ordering in scalable shared-memory multiprocessors , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.

[14]  Alexander V. Veidenbaum,et al.  Compiler-directed cache management in multiprocessors , 1990, Computer.

[15]  Anoop Gupta,et al.  Performance evaluation of memory consistency models for shared-memory multiprocessors , 1991, ASPLOS IV.