Boosting the Performance of Shared Memory Multiprocessors

Proposed hardware optimizations to CC-NUMA machines (shared-memory multiprocessors that use cache consistency protocols) can shorten the time processors lose because of cache misses and invalidations. The authors look at the cost-performance trade-offs of each optimization.