The Minimum Failure Detector For Non-Local Tasks In Message-Passing Systems

Intuitively, a task is local if the output value of each process depends only on the process's own input value, not on the input values of the other processes; a task is non-local otherwise. In this paper, we use the failure detector abstraction to determine the minimum information about failures that is necessary to solve non-local tasks in message-passing systems. More precisely, we show that there is a non-trivial failure detector, denoted FS*, that is necessary to solve non-local tasks, i.e., FS* can be extracted from any failure detector that can be used to solve any non-local task in message-passing systems. We also show that FS* is the strongest failure detector with this property. So, intuitively, FS* is the greatest lower bound of the set of failure detectors that solve non-local tasks in message-passing systems. Even though FS* is quite weak, it is strong enough to solve a natural weakening of the well-known set agreement task, which we call weak set agreement. In fact, we show that FS* is the weakest failure detector to solve the weak set agreement task. Finally, we compare FS* to two closely related failure detectors, namely L and anti-Omega, which are the weakest failure detectors to solve set agreement in message-passing and shared-memory systems, respectively. We prove that, in message-passing systems, anti-Omega is strictly weaker than FS* and FS* is strictly weaker than L.
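
To make the intuitive notion of locality concrete, the following is a minimal sketch under simplifying assumptions that are not part of the paper: a task is modeled as a deterministic function from an input vector to an output vector over a small finite domain, whereas tasks are in general relations. The names `is_local`, `negate_own_input`, and `adopt_minimum` are hypothetical and chosen only for illustration.

```python
from itertools import product


def is_local(task, domain, n):
    """Check, by brute force, whether output i depends only on input i.

    Assumption: `task` maps a tuple of n inputs to a tuple of n outputs.
    """
    # seen[i] maps an input value of process i to the output it produced.
    seen = [{} for _ in range(n)]
    for inputs in product(domain, repeat=n):
        outputs = task(inputs)
        for i, (x, y) in enumerate(zip(inputs, outputs)):
            # The same own input with a different output means process i's
            # output depends on some other process's input.
            if seen[i].setdefault(x, y) != y:
                return False
    return True


def negate_own_input(inputs):
    """Each process outputs a function of its own input only (local)."""
    return tuple(1 - x for x in inputs)


def adopt_minimum(inputs):
    """Each process outputs the minimum of all inputs (non-local)."""
    return tuple(min(inputs) for _ in inputs)


if __name__ == "__main__":
    print(is_local(negate_own_input, (0, 1), 3))
    print(is_local(adopt_minimum, (0, 1), 3))
```

In this toy model, `negate_own_input` is local because no process needs to learn anything about the others, while `adopt_minimum` is non-local because a process's output changes when another process's input changes.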
