From crash fault-tolerance to arbitrary-fault tolerance: towards a modular approach

This paper presents a generic methodology for transforming a protocol that is resilient to process crashes into one that is resilient to arbitrary failures. It applies to the case where processes run the same program text and regularly exchange messages (i.e., round-based protocols). The methodology follows a modular approach, encapsulating the detection of arbitrary failures in specific modules. This can serve as a starting point for designing tools that perform the transformation automatically. We show an application of this methodology to the case of consensus.
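The modular idea can be illustrated with a minimal sketch. It is not the paper's transformation: the names `Message`, `ArbitraryFaultDetector`, `crash_tolerant_step`, and `hardened_step` are hypothetical, and the well-formedness predicate and the "adopt the smallest value" rule are placeholder assumptions. The sketch only shows the structural point that the round logic of a crash-tolerant protocol can stay unchanged while a separate module screens incoming messages for signs of arbitrary behavior.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass(frozen=True)
class Message:
    sender: int
    round: int
    value: int


@dataclass
class ArbitraryFaultDetector:
    """Hypothetical detection module (an assumption for illustration).

    It flags senders whose messages violate what a correct process,
    running the same program text, could send in the current round.
    """
    is_well_formed: Callable[[Message, int], bool]
    suspected: Set[int] = field(default_factory=set)

    def filter(self, inbox: List[Message], current_round: int) -> List[Message]:
        accepted = []
        for msg in inbox:
            if msg.sender in self.suspected:
                continue  # ignore processes already detected as faulty
            if self.is_well_formed(msg, current_round):
                accepted.append(msg)
            else:
                self.suspected.add(msg.sender)
        return accepted


def crash_tolerant_step(inbox: List[Message]) -> Optional[int]:
    """Placeholder for one round of a crash-tolerant protocol:
    adopt the smallest value received, or None if nothing arrived."""
    return min((m.value for m in inbox), default=None)


def hardened_step(detector: ArbitraryFaultDetector,
                  inbox: List[Message],
                  current_round: int) -> Optional[int]:
    """Same round logic, with the detection module placed in front of it."""
    return crash_tolerant_step(detector.filter(inbox, current_round))
```

In this sketch a message stamped with the wrong round is dropped and its sender is suspected from then on, so it no longer influences the value adopted by the unchanged round logic. A real detection module would need a much stronger notion of well-formedness, for example one backed by signed messages.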
