TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems

TreadMarks is a distributed shared memory (DSM) system for standard Unix systems such as SunOS and Ultrix. This paper presents a performance evaluation of TreadMarks running on Ultrix using DECstation-5000/240s connected by a 100-Mbps switch-based ATM LAN and a 10-Mbps Ethernet. Our objective is to determine the efficiency of a user-level DSM implementation on commercially available workstations and operating systems. We achieved good speedups on the 8-processor ATM network for Jacobi (7.4), TSP (7.2), Quicksort (6.3), and ILINK (5.7). For a slightly modified version of Water from the SPLASH benchmark suite, we achieved only a moderate speedup (4.0) because of its high communication and synchronization rate. Speedups decline on the 10-Mbps Ethernet (5.5 for Jacobi, 6.5 for TSP, 4.2 for Quicksort, 5.1 for ILINK, and 2.1 for Water), reflecting the bandwidth limitations of the Ethernet. These results support the contention that, with suitable networking technology, DSM is a viable technique for parallel computation on clusters of workstations. To achieve these speedups, TreadMarks goes to great lengths to reduce the amount of communication performed to maintain memory consistency. It uses a lazy implementation of release consistency, and it allows multiple concurrent writers to modify a page, reducing the impact of false sharing. Great care was also taken to minimize the overhead of each communication. In particular, on the ATM network, we used a standard low-level protocol, AAL3/4, bypassing the TCP/IP protocol stack. Unix communication overhead, however, remains the main obstacle to better performance for programs like Water. Compared to the Unix communication overhead, memory management cost (both kernel and user level) is small and wire time is negligible.
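
The reported speedups are easier to compare when normalized. The following is a minimal sketch, not part of the original evaluation, that computes two derived figures from the numbers quoted above: parallel efficiency on the ATM network (speedup divided by the 8 processors stated for that configuration) and the fraction of the ATM speedup that each program retains on Ethernet. The function and variable names are illustrative assumptions; only the speedup values come from the abstract.

```python
# Speedups copied from the abstract; every name below is illustrative.
ATM_PROCESSORS = 8  # the abstract states 8 processors for the ATM results

ATM_SPEEDUP = {
    "Jacobi": 7.4, "TSP": 7.2, "Quicksort": 6.3, "ILINK": 5.7, "Water": 4.0,
}
ETHERNET_SPEEDUP = {
    "Jacobi": 5.5, "TSP": 6.5, "Quicksort": 4.2, "ILINK": 5.1, "Water": 2.1,
}


def atm_efficiency(program):
    """Parallel efficiency on ATM: speedup divided by processor count."""
    return ATM_SPEEDUP[program] / ATM_PROCESSORS


def ethernet_retention(program):
    """Fraction of the ATM speedup that survives on the 10-Mbps Ethernet."""
    return ETHERNET_SPEEDUP[program] / ATM_SPEEDUP[program]


if __name__ == "__main__":
    for program in ATM_SPEEDUP:
        print(
            f"{program:10s} ATM efficiency {atm_efficiency(program):.2f}  "
            f"Ethernet/ATM {ethernet_retention(program):.2f}"
        )
```

Under these definitions Water runs at half of ideal efficiency on ATM and keeps roughly half of that speedup on Ethernet, while TSP keeps about nine tenths, which matches the abstract's point that communication-heavy programs are hit hardest by limited bandwidth.
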
This research was supported in part by the National Science Foundation under Grants CCR-9116343, CCR-9211004, CDA-9222911, and CDA-9310073, by the Texas Advanced Technology Program under Grant 003604014, and by a NASA Graduate Fellowship.
