Paper information - Scalable atomic multicast

Scalable atomic multicast

We present a new scalable fault-tolerant algorithm that ensures totally ordered delivery of messages sent to multiple groups of processes. Our algorithm is particularly well suited to large-scale systems because: (1) any process can multicast a message to one or more groups of processes without being forced to join those groups; (2) inter-group total order is ensured system-wide but, for each individual multicast, the number and size of messages exchanged depend only on the number of addressees; (3) process failure detection does not need to be reliable. Our algorithm also has a modular design. It uses two companion protocols, namely a reliable multicast protocol and a consensus protocol, and these protocols are not required to use the same communication channels as the total order protocol or to share variables with it. This approach follows a design methodology based on the composition of (encapsulated) micro-protocols.
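To make the inter-group ordering property concrete, the following is a minimal failure-free sketch of a timestamp-based ordering scheme of the kind commonly attributed to Skeen. It is not the algorithm of the paper. Each group is modeled as a single node, the reliable multicast and consensus companion protocols are omitted, and the names `Group`, `propose`, `finalize`, and `demo` are hypothetical. What it illustrates is that only the addressed groups exchange timestamps for a given message, yet all groups deliver their common messages in the same order.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """One destination group, modeled as a single failure-free node (assumption)."""
    name: str
    clock: int = 0
    pending: dict = field(default_factory=dict)    # msg_id -> (timestamp, is_final)
    delivered: list = field(default_factory=list)  # msg_ids in delivery order

    def propose(self, msg_id):
        """Phase 1: assign a local tentative timestamp to the message."""
        self.clock += 1
        self.pending[msg_id] = (self.clock, False)
        return self.clock

    def finalize(self, msg_id, final_ts):
        """Phase 2: adopt the agreed timestamp, then deliver what is safe."""
        self.clock = max(self.clock, final_ts)
        self.pending[msg_id] = (final_ts, True)
        self._deliver_ready()

    def _deliver_ready(self):
        # Deliver while the smallest pending (timestamp, msg_id) is final.
        # A tentative timestamp can only grow when it becomes final, so no
        # pending message can later be ordered before that minimum.
        while self.pending:
            msg_id, (ts, is_final) = min(
                self.pending.items(), key=lambda kv: (kv[1][0], kv[0]))
            if not is_final:
                break
            del self.pending[msg_id]
            self.delivered.append(msg_id)


def demo():
    a, b = Group("A"), Group("B")
    # Two messages addressed to both groups reach them in opposite orders.
    m1_proposals = [a.propose("m1")]
    m2_proposals = [b.propose("m2")]
    m2_proposals.append(a.propose("m2"))
    m1_proposals.append(b.propose("m1"))
    # The final timestamp is the maximum proposal among the addressees only.
    m1_ts, m2_ts = max(m1_proposals), max(m2_proposals)
    # Finalize m2 first: it stays blocked behind the still-tentative m1.
    for g in (a, b):
        g.finalize("m2", m2_ts)
    for g in (a, b):
        g.finalize("m1", m1_ts)
    return a.delivered, b.delivered
```

Calling `demo()` returns the delivery sequences of the two groups. Both final timestamps are equal in this run, so the tie is broken by message identifier and both groups deliver `m1` before `m2`, even though they received the messages in opposite orders.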
