Sacrificing serializability to attain high availability of data in an unreliable network

We present a simple algorithm for maintaining a replicated distributed dictionary which achieves high availability of data, rapid processing of atomic actions, efficient utilization of storage, and tolerance of node or network failures, including lost or duplicated messages. It does not require transaction logs, synchronized clocks, or other complicated mechanisms for its operation. It achieves consistency constraints that are considerably weaker than serial consistency but are nonetheless adequate for many dictionary applications, such as electronic appointment calendars and mail systems. The degree of consistency achieved depends on the particular history of operation of the system in a way that is intuitive and easily understood. The algorithm implements a "best effort" approximation to full serial consistency, relative to whatever internode communication has successfully taken place, so the semantics are fully specified even under partial failure of the system. Both the correctness of the algorithm and the utility of such weak semantics depend heavily on special properties of the dictionary operations.
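To make the role of those "special properties" concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes that every insertion carries a unique identifier and that a delete always overrides the insert it names; under those assumptions, merging state between nodes is a set union, so duplicated or reordered messages are harmless and a lost message is repaired by any later one. The class `Replica` and all of its method names are hypothetical, and the sketch never discards deletion records, so it does not reproduce the storage efficiency claimed above.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One node's copy of the dictionary (hypothetical structure, for illustration)."""
    name: str
    counter: int = 0
    inserted: dict = field(default_factory=dict)  # entry id -> (key, value)
    deleted: set = field(default_factory=set)     # ids of entries known to be deleted

    def insert(self, key, value):
        # Assumption: (node name, local counter) is unique across all nodes,
        # so no synchronized clock is needed to tell entries apart.
        self.counter += 1
        entry_id = (self.name, self.counter)
        self.inserted[entry_id] = (key, value)
        return entry_id

    def delete(self, entry_id):
        # The delete is remembered, so it wins even if the matching insert
        # reaches this node later.
        self.deleted.add(entry_id)

    def snapshot(self):
        # Message sent to other nodes: a copy of everything this node knows.
        return dict(self.inserted), set(self.deleted)

    def receive(self, message):
        # Merging is a union, hence idempotent and independent of arrival order.
        inserted, deleted = message
        self.inserted.update(inserted)
        self.deleted |= deleted

    def lookup(self):
        # Visible entries: inserted and not known to be deleted.
        return {eid: kv for eid, kv in self.inserted.items()
                if eid not in self.deleted}


if __name__ == "__main__":
    a, b = Replica("a"), Replica("b")
    e1 = a.insert("mon 09:00", "staff meeting")
    e2 = b.insert("tue 14:00", "dentist")
    msg = a.snapshot()
    b.receive(msg)
    b.receive(msg)            # a duplicated message changes nothing
    b.delete(e1)
    a.receive(b.snapshot())   # a learns of e2 and of the deletion of e1
    print(a.lookup() == b.lookup())
```

Each node can answer lookups from its local state at any time, which is where the availability comes from; what a node sees depends only on which messages have reached it so far, which is the sense in which consistency is relative to the communication that has actually taken place.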
