Performance Communication

Memory-based messaging, passing messages between programs through a shared memory segment, is a recognized technique for efficient communication that takes direct advantage of memory system performance. However, the conventional operating system and hardware support for this approach is inefficient, especially in large-scale multiprocessor systems. This paper describes interface, software and hardware optimizations for memory-based messaging that efficiently exploit the basic mechanisms of the memory system to provide superior communication performance. We describe the overall model of optimized memory-based messaging, its implementation in an operating system kernel and hardware support for this approach in a scalable multiprocessor architecture. The optimizations include address-valued signals, message-oriented memory consistency and automatic signaling on write. Performance evaluations show these extensions provide a three-to-five-fold improvement in communication performance over a comparable software-only implementation.
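As a rough illustration of the basic idea only, the following sketch writes a length-prefixed message into a memory segment and hands the receiver the message's offset, loosely mirroring an address-valued signal. It is not the paper's kernel interface or hardware mechanism: the names `HEADER`, `send`, `receive` and `demo` are hypothetical, the signal is an ordinary return value rather than an automatic signal on write, and a single process plays both sender and receiver over an anonymous `mmap` region that stands in for a segment shared between programs.

```python
import mmap
import struct

# Hypothetical message layout: a 4-byte little-endian length, then the payload.
HEADER = struct.Struct("<I")


def send(segment, offset, payload):
    """Write a length-prefixed message at `offset` and return that offset.

    The returned offset plays the role of an address-valued signal: it tells
    the receiver where in the segment the new message starts.
    """
    end = offset + HEADER.size + len(payload)
    if offset < 0 or end > len(segment):
        raise ValueError("message does not fit in the segment")
    HEADER.pack_into(segment, offset, len(payload))
    segment[offset + HEADER.size:end] = payload
    return offset


def receive(segment, signal):
    """Read the message whose location was delivered as `signal`."""
    (length,) = HEADER.unpack_from(segment, signal)
    start = signal + HEADER.size
    return bytes(segment[start:start + length])


def demo():
    # Anonymous mapping used as a stand-in for a segment mapped by two programs.
    segment = mmap.mmap(-1, 4096)
    try:
        signal = send(segment, 128, b"hello")
        return receive(segment, signal)
    finally:
        segment.close()


if __name__ == "__main__":
    print(demo())
```

In this software-only form the receiver still has to be told the offset through some other channel. The abstract's optimizations (address-valued signals, message-oriented memory consistency, automatic signaling on write) are aimed at making that notification and the accompanying consistency handling cheaper.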
