Paper information - Restricted failure detectors: Definition and reduction protocols

This paper investigates unreliable failure detectors with restricted properties in asynchronous distributed systems of n processes, at most f of which may crash. “Restricted” means that the completeness and accuracy properties defining a failure detector class are not required to involve all the correct processes, but only k of them for completeness and k′ of them for accuracy. These restricted properties define the classes R(k,k′) and ♢R(k,k′) of unreliable failure detectors. A reduction protocol that transforms a restricted failure detector into its non-restricted counterpart is presented. It is shown that the reduction requires k+k′>n (to be safe) and max(k,k′)≤n−f (to be live). Hence, when these two conditions are satisfied, R(k,k′) and ♢R(k,k′) are equivalent to Chandra and Toueg's failure detector classes S and ♢S, respectively. This theoretical transformation is also interesting from a practical point of view, because the restricted properties are usually easier to satisfy than their non-restricted counterparts in asynchronous distributed systems.
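The two conditions stated in the abstract are plain arithmetic on n, f, k and k′, so they can be checked mechanically. The following is a minimal sketch, not the paper's reduction protocol: the names `ReductionCheck` and `check_reduction` are hypothetical, and the input ranges enforced here (0 ≤ f < n, 1 ≤ k, k′ ≤ n) are assumptions made for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReductionCheck:
    """Outcome of the two conditions quoted in the abstract (hypothetical helper)."""
    safe: bool  # k + k' > n
    live: bool  # max(k, k') <= n - f

    @property
    def reducible(self) -> bool:
        # Both conditions must hold for the reduction to apply.
        return self.safe and self.live


def check_reduction(n: int, f: int, k: int, k_prime: int) -> ReductionCheck:
    """Evaluate the safety and liveness conditions for given parameters.

    The range checks below are assumptions for this sketch, not taken
    from the paper.
    """
    if not 0 <= f < n:
        raise ValueError("expected 0 <= f < n")
    if not (1 <= k <= n and 1 <= k_prime <= n):
        raise ValueError("expected 1 <= k <= n and 1 <= k' <= n")
    return ReductionCheck(
        safe=(k + k_prime > n),
        live=(max(k, k_prime) <= n - f),
    )


if __name__ == "__main__":
    # n = 7 processes, at most f = 2 crashes.
    for k, k_prime in [(4, 4), (3, 4), (6, 4)]:
        result = check_reduction(7, 2, k, k_prime)
        print(k, k_prime, result.safe, result.live, result.reducible)
```

With n = 7 and f = 2, the pair (4, 4) satisfies both conditions, (3, 4) fails the safety condition because 3 + 4 is not greater than 7, and (6, 4) fails the liveness condition because 6 exceeds n − f = 5. Taken together, the two conditions give n < k + k′ ≤ 2(n − f), so they can only hold simultaneously when f < n/2.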

Michel Raynal | Frédéric Tronel
