The Use of Efficient Broadcast Protocols in Asynchronous Distributed Systems

Reliable broadcast protocols are important tools in distributed and fault-tolerant programming. They are useful for sharing information and for maintaining replicated data in a distributed system. A wide range of such protocols has been proposed, and they differ in their fault tolerance and delivery ordering characteristics. There is a tradeoff between the cost of a broadcast protocol and the degree of ordering it provides. It is therefore desirable to employ protocols that provide only a weak degree of ordering whenever possible. This dissertation presents techniques for deciding how strong an ordering a protocol must provide to solve a given application problem. We show that there are two distinct classes of application problems: problems that can be solved with efficient, asynchronous protocols, and problems that require global ordering. We introduce the concept of a linearization function, which maps partially ordered sets of events to totally ordered histories. We show how to construct an asynchronous implementation that solves a given problem if a linearization function for it can be found. We prove that, in general, the question of whether a problem has an asynchronous solution is undecidable. Hence no general algorithm exists that automatically constructs a suitable linearization function for a given problem. We therefore consider an important subclass of problems that have certain commutativity properties, and we present techniques for constructing asynchronous implementations for this class. These techniques are useful for constructing efficient asynchronous implementations for a broad range of practical problems.
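
To make the idea of a linearization function concrete, the following is a minimal sketch in Python, not the dissertation's construction. It assumes events are identified by sortable ids and that the partial order is given as a set of (before, after) pairs; the names `linearize`, `all_linearizations`, and `apply_history` are hypothetical. The function extends a partial order to one deterministic total history, and the helper enumerates every consistent history so that one can check whether the final state depends on how concurrent events are ordered.

```python
from itertools import permutations


def linearize(events, happens_before):
    """Map a partially ordered set of events to one totally ordered history.

    Assumption: concurrent events are ordered by sorted id, so any process
    that sees the same partial order computes the same history.
    """
    remaining = set(events)
    # For each event, the set of events that must precede it.
    preds = {e: {a for (a, b) in happens_before if b == e} for e in remaining}
    history = []
    while remaining:
        # An event is ready once all of its predecessors have been emitted.
        ready = sorted(e for e in remaining if not (preds[e] & remaining))
        if not ready:
            raise ValueError("happens_before contains a cycle")
        history.append(ready[0])
        remaining.remove(ready[0])
    return history


def all_linearizations(events, happens_before):
    """Yield every total order consistent with the partial order (brute force)."""
    for perm in permutations(list(events)):
        pos = {e: i for i, e in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in happens_before):
            yield list(perm)


def apply_history(history, ops, state=0):
    """Apply the operation attached to each event, in history order."""
    for e in history:
        state = ops[e](state)
    return state


# Commuting updates: every consistent history yields the same final state.
commuting_ops = {"a": lambda s: s + 1, "b": lambda s: s + 2, "c": lambda s: s + 3}
hb = {("a", "c")}  # only a -> c is ordered; b is concurrent with both
history = linearize(commuting_ops, hb)
commuting_results = {
    apply_history(h, commuting_ops) for h in all_linearizations(commuting_ops, hb)
}

# Non-commuting updates: the final state depends on the order chosen.
mixed_ops = {"a": lambda s: s + 1, "b": lambda s: s * 2}
mixed_results = {
    apply_history(h, mixed_ops) for h in all_linearizations(mixed_ops, set())
}
```

In this toy setting the commuting increments reach a single final state under every consistent history, which hints at why commutativity makes weakly ordered delivery sufficient, while the increment and doubling operations reach different states depending on the order, which is the kind of problem that would need a stronger ordering guarantee.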
