Indulgent algorithms (preliminary version)

Informally, an indulgent algorithm is a distributed algorithm that tolerates unreliable failure detection: the algorithm is indulgent towards its failure detector. This paper formally characterises such algorithms and states some of their interesting features. We show that indulgent algorithms are inherently safe and uniform. We also state impossibility results for indulgent solutions to divergent problems like consensus, and failure-sensitive problems like non-blocking atomic commit and terminating reliable broadcast.
