Synchronous system and perfect failure detector: Solvability and efficiency issues

We compare, in terms of solvability and efficiency, the synchronous model, denoted S_S, with the asynchronous model augmented with a perfect failure detector, denoted S_P. We first exhibit a problem that, although time-free, is solvable in S_S but not in S_P. We then examine whether one of these two models allows more efficient solutions when designing fault-tolerant applications. In particular, we concentrate on the uniform consensus problem, which is solvable in both models, and we design a uniform consensus algorithm for the S_S model that is more efficient than any algorithm solving uniform consensus in S_P with respect to a significant time complexity measure. From a practical viewpoint, the synchronous model thus seems better than the asynchronous model augmented with a perfect failure detector.
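To make the synchronous round structure concrete, the following is a minimal sketch of the textbook flooding approach to consensus under crash failures. It is not the algorithm designed in this paper, which is more efficient. The function name `flood_set`, the `crash_plan` format, and the decision rule (take the minimum value seen) are assumptions introduced for illustration only. The sketch also assumes that at most t processes crash.

```python
from typing import Dict, List, Optional, Set, Tuple


def flood_set(
    inputs: List[int],
    t: int,
    crash_plan: Optional[Dict[int, Tuple[int, Set[int]]]] = None,
) -> Dict[int, int]:
    """Simulate flooding consensus over t + 1 synchronous rounds.

    crash_plan (hypothetical format) maps a process id to a pair
    (round, recipients): the process crashes during that round after
    reaching only the listed recipients. At most t entries are assumed.
    Returns the decisions of the processes that never crashed.
    """
    crash_plan = crash_plan or {}
    n = len(inputs)
    seen: List[Set[int]] = [{v} for v in inputs]  # values known to each process
    crashed: Set[int] = set()

    for rnd in range(1, t + 2):  # rounds 1 .. t + 1
        inbox: List[Set[int]] = [set() for _ in range(n)]
        # Send phase: every live process broadcasts the values it has seen.
        for p in range(n):
            if p in crashed:
                continue
            if p in crash_plan and crash_plan[p][0] == rnd:
                recipients = crash_plan[p][1]  # partial broadcast, then crash
                crashed.add(p)
            else:
                recipients = range(n)
            for q in recipients:
                inbox[q] |= seen[p]
        # Receive phase: live processes merge what arrived in this round.
        for q in range(n):
            if q not in crashed:
                seen[q] |= inbox[q]

    # Decisions are taken only at the end of round t + 1.
    return {p: min(seen[p]) for p in range(n) if p not in crashed}


# Process 1 crashes in round 1 after reaching only process 0.
decisions = flood_set([3, 1, 2, 5], t=1, crash_plan={1: (1, {0})})
print(decisions)
```

With at most t crashes, at least one of the t + 1 rounds is crash-free, after which all live processes hold the same set of values. Since every decision in this sketch is taken at the end of round t + 1, a process that crashes earlier never decides, so no two decisions can differ.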
