Consistent global states of distributed systems: fundamental concepts and mechanisms

Distributed systems that span large geographic distances or interconnect large numbers of components are adequately modeled as asynchronous systems. Given the uncertainties in such systems that arise from communication delays and relative speeds of computations, reasoning about global states has to be carried out using local and often imperfect information. In this paper, we consider global predicate evaluation as a canonical problem in order to survey concepts and mechanisms that are useful in coping with uncertainty in distributed computation. We illustrate the utility of the developed techniques by examining distributed deadlock detection and distributed debugging as two instances of global predicate evaluation.
