Improving Fast Paxos: being optimistic with no overhead

The paper addresses the cost of consensus algorithms. It has been shown that, in the best case, consensus can be solved in two communication steps with f < n/2, and in one communication step with f < n/3, where n is the number of processes and f is the maximum number of faulty processes. This leads to a dilemma when choosing a consensus algorithm: greater efficiency or higher resilience. Lamport recently proposed a solution called Fast Paxos, which partly escapes this dilemma. The idea is to combine two types of rounds in a single consensus algorithm: fast rounds and rounds of the ordinary Paxos algorithm. In the best case, Fast Paxos solves consensus in one fast round, that is, it requires only one communication step. Unfortunately, the combination induces some time overhead, so Fast Paxos becomes more expensive than ordinary Paxos when fast rounds do not succeed. In this paper we go one step further: we show that it is possible to tentatively execute a fast round before a classical round without incurring any time overhead when the fast round does not succeed.
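To make the efficiency versus resilience trade-off concrete, the following minimal sketch computes the largest number of faulty processes each bound permits for a given system size. It only evaluates the two inequalities quoted above (f < n/2 and f < n/3); the function name `max_faults` and the sample values of n are illustrative assumptions, not taken from the paper.

```python
def max_faults(n: int, divisor: int) -> int:
    """Return the largest integer f satisfying f < n / divisor.

    divisor=2 corresponds to the two-step bound (f < n/2),
    divisor=3 to the one-step bound (f < n/3).
    Hypothetical helper for illustration only.
    """
    if n < 1 or divisor < 1:
        raise ValueError("n and divisor must be positive")
    # For positive integers, the largest f with f * divisor < n is (n - 1) // divisor.
    return (n - 1) // divisor


if __name__ == "__main__":
    for n in (3, 4, 7, 10):
        two_step = max_faults(n, 2)   # resilience with two communication steps
        one_step = max_faults(n, 3)   # resilience with one communication step
        print(f"n={n}: f<=${two_step} (two steps), f<={one_step} (one step)".replace("$", ""))
```

With seven processes, for instance, the two-step bound tolerates three faulty processes while the one-step bound tolerates only two, which is the gap the dilemma refers to.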
