Specifying and Implementing an Eventual Leader Service for Dynamic Systems

The election of an eventual leader in an asynchronous system prone to process crashes is an important problem in fault-tolerant distributed computing. This problem is known as the implementation of the failure detector $\Omega$. Nearly all papers that propose algorithms implementing such an eventual leader service consider a static system. In contrast, this paper considers a dynamic system, i.e., a system in which processes can enter and leave. The paper makes three contributions. First, it proposes a specification of $\Omega$ suited to dynamic systems. Second, it presents an algorithm implementing this specification and proves it correct. Finally, it discusses the notion of an eventual leader suited to dynamic systems and introduces an additional property related to system stability. Designing an algorithm that satisfies this last property remains an open and challenging problem.
