The BG distributed simulation algorithm

Abstract. We present a shared memory algorithm that allows a set of f+1 processes to wait-free “simulate” a larger system of n processes that may also exhibit up to f stopping failures.

Applying this simulation algorithm to the k-set-agreement problem enables conversion of an arbitrary k-fault-tolerant n-process solution for the k-set-agreement problem into a wait-free (k+1)-process solution for the same problem. Since the (k+1)-process k-set-agreement problem has been shown to have no wait-free solution [5,18,26], this transformation implies that there is no k-fault-tolerant solution to the n-process k-set-agreement problem, for any n.

More generally, the algorithm satisfies the requirements of a fault-tolerant distributed simulation. The distributed simulation implements a notion of fault-tolerant reducibility between decision problems. This paper defines these notions and gives examples of their application to fundamental distributed computing problems.

The algorithm is presented and verified in terms of I/O automata. The presentation has a great deal of interesting modularity, expressed by I/O automaton composition and both forward and backward simulation relations. Composition is used to include a safe agreement module as a subroutine. Forward and backward simulation relations are used to view the algorithm as implementing a multi-try snapshot strategy.

The main algorithm works in snapshot shared memory systems; a simple modification of the algorithm that works in read/write shared memory systems is also presented.
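The abstract names a safe agreement module without describing it. As a rough illustration only, the sketch below models the form of safe agreement usually given in the literature on this simulation: a process announces its value at an "unsafe" level 1, takes a snapshot, and then either backs off to level 0 (if it saw a level 2) or commits to level 2. A decision is possible once no process is at level 1. The class name, method names, and the single-threaded step model are assumptions made for this example, and the paper's own I/O automaton may differ in detail.

```python
class SafeAgreement:
    """Single-threaded model of safe agreement (hypothetical names).

    Each method call stands for one atomic step on snapshot shared memory,
    so a test can interleave steps of different processes explicitly.
    """

    def __init__(self, n):
        self.val = [None] * n      # proposed values, one slot per process
        self.level = [0] * n       # 0 = out, 1 = unsafe window, 2 = committed
        self.seen2 = [False] * n   # local result of each process's snapshot

    def propose(self, i, v):
        # Step 1: publish the value and enter the unsafe window.
        self.val[i] = v
        self.level[i] = 1

    def scan(self, i):
        # Step 2: snapshot the levels; remember whether anyone is committed.
        self.seen2[i] = 2 in self.level

    def settle(self, i):
        # Step 3: leave the unsafe window, backing off if someone committed.
        self.level[i] = 0 if self.seen2[i] else 2

    def try_decide(self):
        # A decision is blocked while any process is in the unsafe window.
        snap = list(self.level)
        if 1 in snap or 2 not in snap:
            return None
        # Agree on the value of the smallest-index committed process.
        return self.val[snap.index(2)]


sa = SafeAgreement(3)
sa.propose(2, "c")
sa.propose(0, "a")
sa.scan(2)
sa.scan(0)
sa.settle(2)
blocked = sa.try_decide()   # None: process 0 is still at level 1
sa.settle(0)
decided = sa.try_decide()   # "a": process 0 is the smallest committed index
```

In this model a process that stops between `propose` and `settle` blocks every later `try_decide`, which is why the module is "safe" rather than wait-free: only a failure inside that short window can stall the agreement.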

[1]  F. Vaandrager.  Forward and Backward Simulations Part I: Untimed Systems , 1993 .

[2]  Yehuda Afek,et al.  Synchronization power depends on the register size , 1993, Proceedings of 1993 IEEE 34th Annual Foundations of Computer Science.

[3]  Soma Chaudhuri,et al.  More Choices Allow More Faults: Set Consensus Problems in Totally Asynchronous Systems , 1993, Inf. Comput.

[4]  Nancy A. Lynch,et al.  Forward and Backward Simulations: I. Untimed Systems , 1995, Inf. Comput.

[5]  Maurice Herlihy,et al.  On the decidability of distributed decision tasks , 1996, PODC '96.

[6]  Eli Gafni,et al.  Generalized FLP impossibility result for t-resilient asynchronous computations , 1993, STOC.

[7]  Sam Toueg,et al.  Wait-freedom vs. t-resiliency and the robustness of wait-free hierarchies (extended abstract) , 1994, PODC '94.

[8]  Elizabeth Borowsky,et al.  Capturing the power of resiliency and set consensus in distributed systems , 1996 .

[9]  Hagit Attiya,et al.  Renaming in an asynchronous environment , 1990, JACM.

[10]  Maurice Herlihy,et al.  The asynchronous computability theorem for t-resilient tasks , 1993, STOC.

[11]  Eli Gafni,et al.  3-processor tasks are undecidable , 1995, PODC '95.

[12]  Soma Chaudhuri,et al.  Understanding the Set Consensus Partial Order Using the Borowsky-Gafni Simulation (Extended Abstract) , 1996, WDAG.

[13]  Nancy A. Lynch,et al.  On the Borowsky-Gafni simulation algorithm , 1996, PODC '96.

[14]  Vassos Hadzilacos,et al.  On the power of shared object types to implement one-resilient Consensus , 1997, PODC '97.

[15]  Yehuda Afek,et al.  Synchronization power depends on the register size (Preliminary Version) , 1993, FOCS 1993.

[16]  Shmuel Zaks,et al.  A Combinatorial Characterization of the Distributed 1-Solvable Tasks , 1990, J. Algorithms.

[17]  Nancy A. Lynch,et al.  Impossibility of distributed consensus with one faulty process , 1983, PODS '83.

[18]  Maurice Herlihy,et al.  On the existence of booster types , 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.

[19]  Maurice Herlihy,et al.  Wait-free synchronization , 1991, TOPL.

[20]  Michael E. Saks,et al.  Wait-free k-set agreement is impossible: the topology of public knowledge , 1993, STOC.

[21]  Nancy A. Lynch,et al.  An introduction to input/output automata , 1989 .

[22]  Seif Haridi,et al.  Distributed Algorithms , 1992, Lecture Notes in Computer Science.
