Efficient Support for Multicomputing on ATM Networks

Abstract

The emergence of a new generation of networks will dramatically increase the attractiveness of loosely-coupled multicomputers based on workstation clusters. The key to achieving high performance in this environment is efficient network access, because the cost of remote access dictates the granularity of parallelism that can be supported. Thus, in addition to traditional distribution mechanisms such as RPC, workstation clusters should support lightweight communication paradigms for executing parallel applications.

This paper describes a simple communication model based on the notion of remote memory access. Applications executing on one host can perform direct memory read or write operations on user-defined remote memory buffers. We have implemented a prototype system based on this model using commercially available workstations and ATM networks. Our prototype uses kernel-based emulation of remote read and write instructions, implemented through unused processor opcodes; thus, applications (or runtime libraries) see direct machine support for remote memory access. We show that this model can be supported safely and efficiently on current systems; for example, a 40-byte remote write operation completes in 30
