A practical building block for solving agreement problems in asynchronous distributed systems

Providing processes with the same view of a global state, or allowing them to take consistent decisions despite asynchrony and failures, is a fundamental problem in distributed systems. Such problems are called agreement problems. Non-blocking atomic commitment and the definition of a single delivery order for broadcast messages are two examples. We define a paradigm, called Single Global View, that encompasses various practical agreement problems. The appeal of this paradigm lies in its practicality: each process starts with an initial value, and these values are pieced together in such a way that, despite process crashes and asynchrony, all correct processes are delivered the same set of values (namely, the Single Global View). The paradigm is as powerful as the consensus problem defined by theoreticians. We give instantiations of the paradigm that solve practical agreement problems, and we present a protocol that implements it.
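To make the idea concrete, here is a minimal sketch of how the two example problems could be derived from a delivered view. It is an illustration only, not the protocol presented in the paper: the `View` type, the use of `None` for a process whose value is absent from the view, and the functions `commit_decision` and `delivery_order` are all hypothetical names and assumptions introduced here. The sketch assumes the view has already been delivered and does not show how agreement on it is reached.

```python
from typing import Dict, List, Optional

# Hypothetical representation (an assumption, not taken from the paper):
# a view maps each process id to the value that process proposed, or to
# None when no value from that process made it into the view.
View = Dict[str, Optional[str]]


def commit_decision(view: View) -> str:
    """Atomic-commitment style rule: COMMIT only if every vote is present and 'yes'."""
    return "COMMIT" if all(vote == "yes" for vote in view.values()) else "ABORT"


def delivery_order(view: View) -> List[str]:
    """Broadcast-ordering style rule: deliver the messages in the view by process id."""
    return [view[pid] for pid in sorted(view) if view[pid] is not None]
```

Because both functions are deterministic, any two processes that are delivered the same view compute the same result, which is the property the paradigm is meant to provide. For instance, `commit_decision({"p1": "yes", "p2": "yes", "p3": None})` evaluates to `"ABORT"` at every process holding that view.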
