Integrating coherency and recoverability in distributed systems

We propose a technique for maintaining coherency of a transactional distributed shared memory used by applications accessing a shared persistent store. Our goal is to improve support for fine-grained distributed data sharing in collaborative design applications, such as CAD systems and software development environments. Traditional research in distributed shared memory has focused on supporting parallel programs; in this paper, we show that distributed programs can benefit from the shared-memory abstraction as well. Our approach, called log-based coherency, integrates coherency support with a standard mechanism for ensuring recoverability of persistent data. In our system, transaction logs are the basis of both recoverability and coherency. We have prototyped log-based coherency as a set of extensions to RVM [Satyanarayanan et al. 94], a runtime package supporting recoverable virtual memory. Our prototype adds coherency support to RVM in a simple way that does not require changes to existing RVM applications. We report on our prototype and its performance, and discuss its relationship to other DSM systems.
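
As a rough illustration of how one transaction log can serve both purposes, the following toy Python model may help. It is a sketch under stated assumptions, not the prototype's design and not the RVM API: the names Node, LogRecord, commit, receive, and recover are hypothetical, the "stable log" is an in-memory list standing in for a disk log, and locking, ordering, and failure handling are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    """One modified byte range: its starting offset and the new bytes."""
    offset: int
    data: bytes


def apply_record(memory: bytearray, rec: LogRecord) -> None:
    """Install the new bytes of one log record into a memory image."""
    memory[rec.offset:rec.offset + len(rec.data)] = rec.data


@dataclass
class Node:
    """Hypothetical node caching one shared persistent region."""
    memory: bytearray
    peers: list = field(default_factory=list)
    # Assumption: a Python list stands in for the on-disk transaction log.
    stable_log: list = field(default_factory=list)

    def commit(self, writes):
        """Commit a transaction given as (offset, new_bytes) pairs."""
        records = [LogRecord(off, bytes(data)) for off, data in writes]
        for rec in records:
            apply_record(self.memory, rec)
        # Recoverability: the records are appended to the log.
        self.stable_log.append(records)
        # Coherency: the very same records are sent to the other nodes.
        for peer in self.peers:
            peer.receive(records)

    def receive(self, records):
        """Bring the local copy up to date from another node's log records."""
        for rec in records:
            apply_record(self.memory, rec)


def recover(initial: bytes, stable_log) -> bytearray:
    """Rebuild the region after a crash by replaying the committed log."""
    memory = bytearray(initial)
    for records in stable_log:
        for rec in records:
            apply_record(memory, rec)
    return memory
```

In this sketch, a commit on one node produces a single list of records that is both appended to the log and applied by each peer, so replaying the log after a crash and receiving it over the network lead to the same memory contents.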
