Paper information - Low cost consensus-based Atomic Broadcast

Low cost consensus-based Atomic Broadcast

Atomic Broadcast (all processes deliver the same set of messages in the same order) is a powerful communication primitive for building fault-tolerant distributed systems. Moreover, Atomic Broadcast and Consensus have been shown to be equivalent problems in asynchronous distributed systems prone to process crash failures. Hence, several Consensus-based Atomic Broadcast protocols have been designed. This paper introduces a new and particularly efficient Consensus-based Atomic Broadcast protocol. The efficiency is obtained by limiting the use of the Consensus subroutine to the cases where asynchrony and crashes prevent processes from obtaining a simple agreement on the message delivery order. The protocol assumes n > 2f, where n is the number of processes and f is the maximum number of them that can crash. In the most favorable cases, it requires two communication steps for processes to determine a message batch; in the worst case, it requires an additional Consensus execution. It is shown that, when n > 3f, the protocol can be simplified, and it then requires a single communication step in the most favorable cases. This exhibits an interesting tradeoff between the cost of the protocol and the maximum number of process failures it tolerates.
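The tradeoff stated in the abstract can be made concrete with a little arithmetic. The following is a minimal sketch, not the paper's protocol: it only computes, for a given n, the largest f allowed by each resilience condition (n > 2f and n > 3f) and pairs it with the best-case number of communication steps quoted above. The function names `max_crashes` and `tradeoff` and the dictionary layout are hypothetical, chosen for illustration.

```python
def max_crashes(n: int, k: int) -> int:
    """Largest f satisfying n > k * f (hypothetical helper, not from the paper)."""
    if n < 1 or k < 1:
        raise ValueError("n and k must be positive integers")
    # n > k*f  <=>  k*f <= n - 1  <=>  f <= (n - 1) // k
    return (n - 1) // k


def tradeoff(n: int) -> dict:
    """Pair each resilience condition from the abstract with its best-case cost."""
    return {
        # General protocol: assumes n > 2f, two communication steps at best.
        "n > 2f": {"max_f": max_crashes(n, 2), "best_case_steps": 2},
        # Simplified protocol: assumes n > 3f, one communication step at best.
        "n > 3f": {"max_f": max_crashes(n, 3), "best_case_steps": 1},
    }


if __name__ == "__main__":
    for n in (4, 7, 10):
        print(n, tradeoff(n))
```

For n = 7, for instance, the general variant tolerates up to 3 crashes with two best-case steps, while the simplified variant tolerates up to 2 crashes with one best-case step. In the worst case the general protocol needs an additional Consensus execution, which this sketch does not model.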

Achour Mostéfaoui | Michel Raynal
