A fully non-blocking reliable multicast protocol with total ordering

We present an efficient protocol for reliable multicast in an asynchronous network subject to link and process failures. The protocol preserves total ordering: as processes or communication links fail, each group of non-faulty processes that remains connected agrees on the same sequence of delivered messages. Even processes that become disconnected deliver messages in a consistent order, i.e., message delivery is globally consistent. Protocols that achieve reliable multicast with total ordering are known in the literature, but when processes or links fail they block the delivery of certain messages until the non-faulty processes reach agreement on membership. In contrast, our protocol is fully non-blocking: non-faulty processes continue to deliver all messages despite a process failure or a change in the membership of the multicast group. Moreover, unlike prior work, our protocol does not assume an underlying layer that detects link or process failures.
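
The consistency property can be made concrete with a small checker over delivery logs. The sketch below is not part of the protocol; it assumes one common reading of "globally consistent", namely that any two processes deliver the messages they have in common in the same relative order, and it assumes each process delivers a message at most once. The names `consistent_order`, `globally_consistent`, and the sample logs are hypothetical.

```python
from itertools import combinations


def consistent_order(log_a, log_b):
    """True if the messages present in both logs appear in the same relative order.

    Assumption: each log lists a message at most once.
    """
    common = set(log_a) & set(log_b)
    shared_a = [m for m in log_a if m in common]
    shared_b = [m for m in log_b if m in common]
    return shared_a == shared_b


def globally_consistent(logs):
    """True if every pair of per-process delivery logs is order-consistent.

    `logs` maps a (hypothetical) process name to its list of delivered messages.
    """
    return all(
        consistent_order(logs[p], logs[q]) for p, q in combinations(logs, 2)
    )


# p3 was disconnected and missed m2, but its order still agrees with p1 and p2.
ok_logs = {
    "p1": ["m1", "m2", "m3"],
    "p2": ["m1", "m2", "m3"],
    "p3": ["m1", "m3"],
}

# p3 delivers m3 before m1, contradicting p1 and p2.
bad_logs = {
    "p1": ["m1", "m2", "m3"],
    "p2": ["m1", "m2", "m3"],
    "p3": ["m3", "m1"],
}

print(globally_consistent(ok_logs))   # True
print(globally_consistent(bad_logs))  # False
```

Under this reading, a disconnected process may deliver fewer messages than the others, but it never delivers two messages in an order that contradicts another process.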
