Managing concurrent access for shared memory active messages

Message passing through shared memory plays an important role on symmetric multiprocessors (SMPs) and on Clumps (clusters of SMPs). Managing concurrent access to message queues is a central design issue for shared-memory message-passing systems. Using both microbenchmarks and applications, this paper compares the performance of concurrent-access algorithms for passing active messages on a Sun Enterprise 5000 server. It presents a new lock-free algorithm that provides many of the advantages of non-blocking algorithms while avoiding the overhead of true non-blocking behavior. The lock-free algorithm couples synchronization tightly to the data structure and delivers better application performance than every other algorithm studied. The success of this algorithm suggests that other practical problems might also benefit from a reexamination of the non-blocking literature.
