Optimal and Practical WAB-Based Consensus Algorithms

This paper introduces two new WAB-based consensus algorithms for the crash-recovery model. The first, B*-Consensus, tolerates f < n/2 permanent faults and can solve consensus in three communication steps. The second, R*-Consensus, is f < n/3 resilient and can solve consensus in two communication steps. Both algorithms are optimal with respect to the tradeoff between time complexity and resilience. We compare them with other consensus algorithms in the crash-recovery model.
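
To make the resilience bounds concrete, the following sketch computes the largest number of faults each bound allows for a given number of processes n, next to the step counts stated above. It only illustrates the arithmetic of f < n/2 and f < n/3; the function names `max_faults` and `tradeoff` are hypothetical and do not come from the paper, and nothing here implements the algorithms themselves.

```python
def max_faults(n: int, divisor: int) -> int:
    """Largest integer f satisfying the strict bound f < n / divisor."""
    if n < 1 or divisor < 1:
        raise ValueError("n and divisor must be positive integers")
    # f < n / divisor  <=>  f * divisor <= n - 1  <=>  f <= (n - 1) // divisor
    return (n - 1) // divisor


def tradeoff(n: int) -> dict:
    """Tabulate the bounds from the abstract for n processes (illustrative only)."""
    return {
        "B*-Consensus": {"max_faults": max_faults(n, 2), "steps": 3},
        "R*-Consensus": {"max_faults": max_faults(n, 3), "steps": 2},
    }


if __name__ == "__main__":
    for n in (4, 7, 10):
        print(n, tradeoff(n))
```

Under these bounds, the algorithm that needs fewer communication steps tolerates fewer faults for the same n, which is the tradeoff the abstract refers to.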
