Optimal Availability Quorum Systems: Theory and Practice

Quorum systems serve as a basic tool providing a uniform and reliable way to achieve coordination in a distributed system. They are useful for distributed and replicated databases, name servers, mutual exclusion, and distributed access control and signatures. The unavailability of a quorum system is the probability of the event that no live quorum exists in the system. When such an event occurs, the service is completely halted. Unavailability is widely accepted as the measure by which quorum systems are evaluated. In this paper we characterize the optimal availability quorum system in the general case, when the failure probabilities may take any value in the range 0 < p_i < 1. Then we deal with the practical scenario in which the failure probabilities are unknown but can be estimated. We give a robust and efficient algorithm that calculates a near-optimal quorum system based on the estimated failure probabilities.
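To make the definition of unavailability concrete, the following is a minimal sketch that computes it by brute-force enumeration, assuming nodes fail independently. It is an illustration only, not the algorithm of the paper; the function name `unavailability`, the node labels, and the failure probability 0.1 are hypothetical choices.

```python
from itertools import product


def unavailability(quorums, fail_probs):
    """Probability that no quorum is fully alive.

    Assumes independent node failures and enumerates all 2^n
    alive/failed patterns, so it is only practical for small n.
    """
    nodes = sorted(fail_probs)
    total = 0.0
    for pattern in product([True, False], repeat=len(nodes)):
        # Set of nodes that are alive in this pattern.
        alive = {n for n, up in zip(nodes, pattern) if up}
        # Probability of this exact pattern under independence.
        prob = 1.0
        for n, up in zip(nodes, pattern):
            prob *= (1.0 - fail_probs[n]) if up else fail_probs[n]
        # The pattern counts toward unavailability if no quorum survives.
        if not any(q <= alive for q in quorums):
            total += prob
    return total


# Hypothetical example: majority quorums over three nodes, each failing
# with probability 0.1.
majority = [frozenset(q) for q in ({"a", "b"}, {"a", "c"}, {"b", "c"})]
probs = {"a": 0.1, "b": 0.1, "c": 0.1}
result = unavailability(majority, probs)
```

For this example the system is unavailable exactly when at least two of the three nodes fail, so the result is 3(0.1)^2(0.9) + (0.1)^3 = 0.028.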
