Consensus in the presence of partial synchrony

The concept of partial synchrony in a distributed system is introduced. Partial synchrony lies between the cases of a synchronous system and an asynchronous system. In a synchronous system, there is a known fixed upper bound Δ on the time required for a message to be sent from one processor to another and a known fixed upper bound Φ on the relative speeds of different processors. In an asynchronous system, no fixed upper bounds Δ and Φ exist. In one version of partial synchrony, fixed bounds Δ and Φ exist, but they are not known a priori. The problem is to design protocols that work correctly in the partially synchronous system regardless of the actual values of the bounds Δ and Φ. In another version of partial synchrony, the bounds are known, but are only guaranteed to hold starting at some unknown time T, and protocols must be designed to work correctly regardless of when time T occurs. Fault-tolerant consensus protocols are given for various cases of partial synchrony and various fault models. Lower bounds are also given, showing that in most cases our protocols are optimal with respect to the number of faults tolerated. Our consensus protocols for partially synchronous processors use new protocols for fault-tolerant “distributed clocks” that allow partially synchronous processors to reach some approximately common notion of time.
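
To make the two versions of partial synchrony concrete, the following is a minimal sketch, not the paper's formalism. It looks only at the message-delay bound Δ (the processor-speed bound Φ is not modeled), uses discrete time steps, and assumes a finite trace in which every message is eventually delivered. The names `Message`, `delta_holds_from`, and `smallest_fixed_bound` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sent_at: int       # time step at which the message was sent
    delivered_at: int  # time step at which the message was delivered


def delta_holds_from(messages, delta, start):
    """Return True if every message sent at or after `start` was
    delivered within `delta` time steps (assumed discrete time)."""
    return all(
        m.delivered_at - m.sent_at <= delta
        for m in messages
        if m.sent_at >= start
    )


def smallest_fixed_bound(messages):
    """Return the smallest delay bound that holds over the whole trace.

    This mirrors the 'bound exists but is unknown' version: the value
    can be computed in hindsight, but a protocol cannot rely on it."""
    return max((m.delivered_at - m.sent_at for m in messages), default=0)


# Illustrative trace: two slow messages early on, fast ones afterwards.
trace = [Message(0, 9), Message(2, 12), Message(5, 7), Message(6, 8)]

# A known bound of 2 is violated if it must hold from time 0 ...
violated_from_start = not delta_holds_from(trace, delta=2, start=0)
# ... but holds if it is only required from T = 5 onward.
holds_from_T = delta_holds_from(trace, delta=2, start=5)
# Some fixed bound holds throughout, though it is not known a priori.
hindsight_bound = smallest_fixed_bound(trace)
```

In this trace, a bound of 2 fails when required from time 0 but holds from T = 5 onward, which is the second version of partial synchrony. The first version corresponds to `smallest_fixed_bound`: a fixed bound exists for the whole trace, but a protocol has to work without being told its value.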
