Evaluation of Communication Mechanisms in Invalidate-Based Shared Memory Multiprocessors

Producer-initiated mechanisms are added to invalidate-based systems to reduce communication latency by transferring data as soon as it is produced. This paper compares the performance of three producer-initiated mechanisms: lock, deliver, and StreamLine. All three outperform invalidate with prefetch in most cases.
