A simple quorum reconfiguration for open distributed environments

Synchronization schemes based on quorum consensus are well-known solutions to fundamental problems in distributed mutual exclusion and replica control. Open distributed environments need a mechanism for reconfiguring the quorum structure, because membership changes (nodes joining and leaving) may decrease quorum availability. Many algorithms have been proposed for this problem; however, most of them replace the quorum system entirely, so no operation can be performed while the system is being reconfigured. This paper presents a simple quorum reconfiguration algorithm for open distributed computing systems that adapts to membership changes in the environment. The algorithm is easy to use because it relies on just two quorum operations, join-replace and join-cross. The join-replace operation is used when a set of nodes has left the system while others are joining; the join-cross operation is used when nodes are only joining the system. The algorithm has two main advantages. First, operations can complete before the new quorum structure is fully constructed, so the system avoids waiting and never enters a halt state during reconfiguration. Second, it directly adopts the quorum consensus protocol used in static environments, without any change to the protocol. Moreover, no extra mapping procedure is needed, since the algorithm works only in the logical space.
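The abstract does not define join-replace or join-cross in detail, so the following is only a minimal sketch of the property any reconfiguration must preserve: every pair of quorums intersects. The helper names `is_quorum_system` and `replace_nodes` are hypothetical, and the substitution shown is an illustrative assumption, not the paper's join-replace operation.

```python
from itertools import combinations


def is_quorum_system(quorums):
    """Return True if every pair of quorums shares at least one node."""
    return all(a & b for a, b in combinations(quorums, 2))


def replace_nodes(quorums, mapping):
    """Hypothetical helper: substitute departed nodes with joining nodes.

    `mapping` sends each departed node to the joining node that takes its
    place. Nodes not listed in `mapping` are kept unchanged. This is an
    illustration only, not the paper's join-replace operation.
    """
    return [frozenset(mapping.get(node, node) for node in q) for q in quorums]


# Majority quorums over the three nodes a, b, c.
quorums = [frozenset(q) for q in ({"a", "b"}, {"b", "c"}, {"a", "c"})]

# Node c leaves while node d joins and takes over c's logical position.
replaced = replace_nodes(quorums, {"c": "d"})
```

Because the substitution is applied uniformly to every quorum, a node shared by two quorums before the change is mapped to a node shared by both afterwards, so the intersection property carries over to `replaced`.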
