Real-time Leader Election Algorithm in Asynchronous Distributed Systems

This paper proposes a failure detection algorithm that efficiently detects failures in part of a process group in an asynchronous distributed system. Building on it, the paper also proposes a real-time leader election (RTLE) algorithm that elects a new leader when the process acting as leader fails. The algorithm maintains a list of currently working processes, from which a list of next-leader candidates is built in real time. Each process consults the candidate list, and when the current leader fails, the candidate with the highest priority takes over as the next leader. The evaluation results show that the proposed failure detection algorithm is more stable than A. Mostefaoui's algorithm, even in environments with network delays. The proposed RTLE algorithm is also much more stable than the other algorithms evaluated, even as the number of processes, the network delay, and the number of additionally crashed leader candidates vary.
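The candidate-list idea can be illustrated with a minimal sketch. It is not the paper's algorithm: the heartbeat-with-timeout failure detector, the `LeaderTracker` class, the `timeout` parameter, and the numeric `priority` map are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LeaderTracker:
    """Hypothetical helper: tracks live processes and ranks leader candidates."""

    timeout: float                 # assumed: seconds of silence before a process is suspected
    priority: Dict[int, int]       # assumed: process id -> priority (higher value wins)
    last_seen: Dict[int, float] = field(default_factory=dict)
    leader: Optional[int] = None

    def heartbeat(self, pid: int, now: float) -> None:
        """Record that process `pid` was heard from at time `now`."""
        self.last_seen[pid] = now

    def alive(self, now: float) -> List[int]:
        """Processes whose last heartbeat is within the timeout window."""
        return [p for p, t in self.last_seen.items() if now - t <= self.timeout]

    def candidates(self, now: float) -> List[int]:
        """Live processes ordered from highest to lowest priority."""
        return sorted(self.alive(now), key=lambda p: self.priority[p], reverse=True)

    def current_leader(self, now: float) -> Optional[int]:
        """Keep the current leader while it is alive; otherwise promote
        the highest-priority candidate (or None if nobody is alive)."""
        live = self.candidates(now)
        if self.leader not in live:
            self.leader = live[0] if live else None
        return self.leader


tracker = LeaderTracker(timeout=2.0, priority={1: 1, 2: 2, 3: 3})
for pid in (1, 2, 3):
    tracker.heartbeat(pid, now=0.0)
first_leader = tracker.current_leader(now=1.0)

# Process 3 stops sending heartbeats; 1 and 2 keep going.
tracker.heartbeat(1, now=2.0)
tracker.heartbeat(2, now=2.0)
next_leader = tracker.current_leader(now=3.0)
```

Because the candidate list is already ranked when the leader is suspected, the replacement is chosen by a local lookup rather than by a fresh election round. In this sketch a live leader is kept even if a higher-priority process appears later; that is a design choice of the sketch, not a claim about the paper.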

[1] Ernest J. H. Chang, et al. An improved algorithm for decentralized extrema-finding in circular configurations of processes, 1979, CACM.

[2] Hector Garcia-Molina, et al. Elections in a Distributed Computing System, 1982, IEEE Transactions on Computers.

[3] Bing Wei, et al. A self-stabilizing leader election algorithm in OneHop DHT, 2009, 2009 Fourth International Conference on Communications and Networking in China.

[4] Hoon Choi, et al. The Fast Bully Algorithm: For Electing a Coordinator Process in Distributed Systems, 2002, ICOIN.

[5] Benjamin Satzger, et al. A new adaptive accrual failure detector for dependable distributed systems, 2007, SAC '07.

[6] 김진혁, et al. A Simple Failure Detection Algorithm in Active-Standby Data Processing Systems (in Korean), 2010.

[7] Sam Toueg, et al. Unreliable failure detectors for reliable distributed systems, 1996, JACM.

[8] Marcos K. Aguilera, et al. On the quality of service of failure detectors, 2000, Proceeding International Conference on Dependable Systems and Networks. DSN 2000.

[9] Du Ling, et al. A coordinator election algorithm for P2P MMOG, 2008.

[10] Gérard Le Lann, et al. Distributed Systems - Towards a Formal Approach, 1977, IFIP Congress.

[11] Marcos K. Aguilera, et al. On implementing omega with weak reliability and synchrony assumptions, 2003, PODC '03.

[12] Marcos K. Aguilera, et al. On implementing omega in systems with weak reliability and synchrony assumptions, 2008, Distributed Computing.

[13] Achour Mostéfaoui, et al. On the Fly Estimation of the Processes that Are Alive in an Asynchronous Message-Passing System, 2009, IEEE Transactions on Parallel and Distributed Systems.

[14] Benjamin Satzger, et al. A Lazy Monitoring Approach for Heartbeat-Style Failure Detectors, 2008, 2008 Third International Conference on Availability, Reliability and Security.

[15] Naohiro Hayashibara, et al. The φ Accrual Failure Detector, 2004.