Mutual exclusion in partitioned distributed systems

A network partition can break a distributed computing system into groups of isolated nodes. When this occurs, a mutual exclusion mechanism may be required to ensure that isolated groups do not concurrently perform conflicting operations. We study and formalize these mechanisms in three basic scenarios: where there is a single conflicting type of action; where there are two conflicting types, but operations of the same type do not conflict; and where there are two conflicting types, but operations of one type do not conflict among themselves. For each scenario, we present applications that require mutual exclusion (e.g., name servers, termination protocols, concurrency control). In each case, we also present mutual exclusion mechanisms that are more general and that may provide higher reliability than the voting mechanisms that have been proposed as solutions to this problem.
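As a rough illustration of the first scenario (a single conflicting type of action), the sketch below contrasts a weighted majority vote with an explicit list of permitted groups. The function names, the vote table, and the rule that permitted groups must pairwise intersect are assumptions made for illustration; they are not taken from the text above and are not presented as the authors' formalization.

```python
from itertools import combinations

def has_majority(group, votes):
    """Voting rule: the group may act only if it holds a strict majority of all votes."""
    total = sum(votes.values())
    return 2 * sum(votes[node] for node in group) > total

def is_mutually_exclusive(permitted):
    """Hypothetical generalization: every pair of permitted groups shares a node,
    so two disjoint (isolated) groups can never both contain a permitted group."""
    return all(set(a) & set(b) for a, b in combinations(permitted, 2))

def may_proceed(isolated_group, permitted):
    """An isolated group may act if it contains at least one permitted group."""
    return any(set(p) <= set(isolated_group) for p in permitted)

# Assumed example: three nodes with one vote each, split by a partition.
votes = {"a": 1, "b": 1, "c": 1}
partition = [{"a", "b"}, {"c"}]
active_by_vote = [g for g in partition if has_majority(g, votes)]

# Assumed example: an explicit list of permitted groups over four nodes.
permitted = [{"a", "b"}, {"b", "c"}, {"a", "c", "d"}]
```

In this sketch, only the side of the partition holding two of the three votes may act, and the explicit list plays the same role without assigning any votes.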
