Design of the Munin Distributed Shared Memory System

Software distributed shared memory (DSM) is a software abstraction of shared memory on a distributed-memory machine. The key problem in building an efficient DSM system is reducing the amount of communication needed to keep the distributed memories consistent. The Munin DSM system incorporates a number of novel techniques for doing so, including the use of multiple consistency protocols and support for multiple-concurrent-writer protocols. Because of these and other features, Munin achieves high performance on a variety of numerical applications. This paper describes the design and implementation of the Munin prototype in detail, with special emphasis on its novel write-shared protocol. It also describes a number of lessons, learned from our experience with the prototype implementation, that are relevant to the implementation of future DSMs.
