Collision-Fast Atomic Broadcast

Atomic Broadcast, an important abstraction in dependable distributed computing, is usually implemented by solving infinitely many instances of the well-known consensus problem. Some asynchronous consensus algorithms achieve the optimal latency of two message steps, but they cannot guarantee this latency even in good runs, that is, runs with timely message delivery and no crashes. The cause is collisions, which result from concurrent proposals. Collision-fast consensus algorithms, which decide within two steps in good runs, exist under certain conditions. Applying them directly to atomic broadcast, however, does not guarantee delivery in two steps for all messages unless a single failure is tolerated. We show a simple way to build a fault-tolerant collision-fast Atomic Broadcast algorithm based on a variation of the consensus problem that we call M-Consensus. Our solution to M-Consensus extends the Paxos protocol so that multiple processes, rather than only the single leader, can have their proposals learned in two steps.
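
The reduction mentioned above, in which atomic broadcast is obtained from a sequence of consensus instances, can be illustrated with a minimal single-machine sketch. It is not the M-Consensus algorithm of this paper. The names `toy_consensus` and `run_atomic_broadcast` are hypothetical, the consensus step is replaced by a deterministic stand-in that picks the smallest proposal, and every process is assumed to have received the same set of messages.

```python
from typing import Dict, List


def toy_consensus(proposals: List[str]) -> str:
    """Stand-in for one consensus instance (assumption: decide the smallest proposal).

    A real protocol such as Paxos would exchange messages here; this toy
    only preserves the property that a single proposed value is decided.
    """
    return min(proposals)


def run_atomic_broadcast(receive_orders: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Deliver messages in the order decided by successive consensus instances.

    receive_orders maps a process id to the messages it received, in its own
    local arrival order. All processes are assumed to hold the same set.
    """
    delivered: Dict[str, List[str]] = {p: [] for p in receive_orders}
    while True:
        # Each process proposes its first message that is not yet delivered.
        proposals = []
        for p, msgs in receive_orders.items():
            undelivered = [m for m in msgs if m not in delivered[p]]
            if undelivered:
                proposals.append(undelivered[0])
        if not proposals:
            break
        # One consensus instance fixes the next message for every process.
        decision = toy_consensus(proposals)
        for p in delivered:
            delivered[p].append(decision)
    return delivered


orders = {
    "p1": ["a", "b", "c"],
    "p2": ["c", "a", "b"],
    "p3": ["b", "c", "a"],
}
result = run_atomic_broadcast(orders)
```

In the first instance the three processes propose three different messages concurrently. This is the kind of situation the abstract calls a collision; the toy decision rule hides the extra latency that a real asynchronous protocol may incur in that case.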
