Managing update conflicts in Bayou, a weakly connected replicated storage system

Bayou is a replicated, weakly consistent storage system designed for a mobile computing environment that includes portable machines with less-than-ideal network connectivity. To maximize availability, users can read and write any accessible replica. Bayou's design has focused on supporting application-specific mechanisms to detect and resolve the update conflicts that naturally arise in such a system, ensuring that replicas move towards eventual consistency, and defining a protocol by which the resolution of update conflicts stabilizes. It includes novel methods for conflict detection, called dependency checks, and per-write conflict resolution based on client-provided merge procedures. To guarantee eventual consistency, Bayou servers must be able to roll back the effects of previously executed writes and redo them according to a global serialization order. Furthermore, Bayou permits clients to observe the results of all writes received by a server, including tentative writes whose conflicts have not been ultimately resolved. This paper presents the motivation for and design of these mechanisms and describes the experiences gained with an initial implementation of the system.
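
The interplay of dependency checks, merge procedures, and rollback/redo described above can be illustrated with a minimal sketch. It is not Bayou's implementation: the names `Write`, `Replica`, `reserve`, and the integer `stamp` standing in for a position in the global serialization order are all hypothetical, the slot-reservation scenario is chosen only for illustration, and "roll back and redo" is simplified to replaying the whole log from an empty database rather than undoing individual writes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Database = Dict[str, str]  # hypothetical store: slot name -> owner


@dataclass
class Write:
    stamp: int                                    # assumed position in the global order
    update: Callable[[Database], None]            # the intended update
    dependency_check: Callable[[Database], bool]  # True when no conflict is detected
    merge_proc: Callable[[Database], None]        # client-provided resolution on conflict


class Replica:
    def __init__(self) -> None:
        self.log: List[Write] = []
        self.db: Database = {}

    def receive(self, write: Write) -> None:
        # Insert the write at its place in the serialization order.
        self.log.append(write)
        self.log.sort(key=lambda w: w.stamp)
        # Simplified roll back and redo: discard state and replay the log.
        self.db = {}
        for w in self.log:
            if w.dependency_check(self.db):
                w.update(self.db)
            else:
                w.merge_proc(self.db)


def reserve(stamp: int, user: str, preferred: str, alternate: str) -> Write:
    """Build a write that reserves `preferred`, falling back to `alternate`."""
    def check(db: Database) -> bool:
        return preferred not in db

    def update(db: Database) -> None:
        db[preferred] = user

    def merge(db: Database) -> None:
        if alternate not in db:
            db[alternate] = user

    return Write(stamp, update, check, merge)


# Two replicas receive the same writes in different orders.
a, b = Replica(), Replica()
w1 = reserve(1, "alice", "9am", "10am")
w2 = reserve(2, "bob", "9am", "11am")
a.receive(w1); a.receive(w2)
b.receive(w2); b.receive(w1)
```

In this sketch replica `b` first shows bob holding the 9am slot as a tentative result; once the earlier-ordered write arrives, the replay moves bob to his alternate, and both replicas end in the same state.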
