Refined quorum systems

It is considered good distributed computing practice to devise object implementations that tolerate contention, periods of asynchrony and a large number of failures, but perform fast when few failures occur, the system is synchronous and there is no contention. This paper presents the first study of quorum systems that help design such implementations by combining optimal resilience (just like traditional quorum systems) with optimal best-case complexity (unlike traditional quorum systems). We introduce the notion of a refined quorum system (RQS) of some set S as a set of three classes of subsets (quorums) of S: first class quorums are also second class quorums, which are in turn also third class quorums. First class quorums have large intersections with all other quorums, and second class quorums typically have smaller intersections with those of the third class; the latter simply correspond to traditional quorums. Intuitively, under uncontended and synchronous conditions, a distributed object implementation would expedite an operation if a quorum of the first class is accessed, and then degrade gracefully depending on whether a quorum of the second or the third class is accessed. Our notion of refined quorum system is devised assuming a general adversary structure, which allows refined quorum systems to be used to relax the assumption of independent process failures, an assumption often questioned in practice. We illustrate the power of refined quorums by introducing two new optimal Byzantine-resilient distributed object implementations: an atomic storage and a consensus algorithm. Both match previously established resilience and best-case complexity lower bounds, as well as new complexity bounds we establish here, thereby closing open gaps.
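
The class structure described above can be illustrated with a small sketch. It checks only what the abstract states informally: that the three classes are nested, and that third class quorums pairwise intersect, as in a traditional quorum system. The function names, the four-element set S and the chosen quorums are assumptions made for illustration; the formal RQS properties, which are stated with respect to an adversary structure, are not reproduced here.

```python
from itertools import combinations

def is_nested(first, second, third):
    """True if every first class quorum is also second class,
    and every second class quorum is also third class."""
    return set(first) <= set(second) <= set(third)

def pairwise_intersect(quorums):
    """True if any two quorums share at least one element
    (the traditional quorum intersection property)."""
    return all(a & b for a, b in combinations(quorums, 2))

def min_intersection(class_a, class_b):
    """Smallest intersection size between distinct quorums of two classes."""
    return min(len(a & b) for a in class_a for b in class_b if a != b)

# Hypothetical example over S = {0, 1, 2, 3}.
S = frozenset(range(4))
third = [frozenset(c) for c in combinations(S, 3)] + [S]  # all subsets of size >= 3
second = [S, frozenset({0, 1, 2})]                        # an arbitrary subset of third
first = [S]                                               # only the full set

print(is_nested(first, second, third))
print(pairwise_intersect(third))
print(min_intersection(first, third), min_intersection(third, third))
```

In this toy instance the first class quorum overlaps every other quorum in more elements than two arbitrary third class quorums overlap each other, which mirrors the intuition that first class quorums have the largest intersections.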
