Failure detectors encapsulate fairness

Failure detectors are commonly viewed as abstractions of the synchronism present in distributed system models. However, attempts to determine exactly how much synchronism a given failure detector encapsulates have met with limited success. The reason is that models of partial synchrony are traditionally specified with respect to real time, whereas failure detectors do not encapsulate real time. We argue instead that failure detectors encapsulate fairness in computation and communication. Fairness measures the number of steps executed by one process relative either to the number of steps taken by another process or to the duration for which a message is in transit. We argue that oracles are substitutable for the fairness properties (rather than the real-time properties) of partially synchronous systems. We propose four fairness-based models of partial synchrony and demonstrate that they are, in fact, the "weakest system models" for implementing the canonical failure detectors of the Chandra-Toueg hierarchy.
