Zyzzyva: Speculative Byzantine Fault Tolerance

We present Zyzzyva, a protocol that uses speculation to reduce the cost and simplify the design of Byzantine fault-tolerant state machine replication. In Zyzzyva, replicas respond to a client's request without first running an expensive three-phase commit protocol to reach agreement on the order in which the request must be processed. Instead, they optimistically adopt the order proposed by the primary and respond immediately to the client. Replicas can thus become temporarily inconsistent with one another, but clients detect inconsistencies, help correct replicas converge on a single total ordering of requests, and rely only on responses that are consistent with this total order. This approach allows Zyzzyva to reduce replication overheads to near their theoretical minimum.
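The client-side check described above can be illustrated with a minimal sketch. It is not the authors' implementation. The thresholds (3f+1 replicas, completion on 3f+1 matching speculative responses, a commit step on 2f+1 to 3f matches) follow the full Zyzzyva paper rather than this abstract, and the function name, the `(replica_id, digest)` representation, and the returned labels are hypothetical names chosen for illustration.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence, Tuple


def classify_responses(responses: Sequence[Tuple[int, Hashable]], f: int) -> str:
    """Decide what a client might do with a set of speculative responses.

    Each response is a hypothetical (replica_id, digest) pair, where the
    digest stands for everything that must match across replicas (the
    proposed order, the request history, and the reply). f is the number
    of Byzantine faults tolerated; the system is assumed to have 3f + 1
    replicas.
    """
    n = 3 * f + 1

    # Count at most one response per replica, so a faulty replica cannot
    # inflate the tally by answering several times.
    by_replica: Dict[int, Hashable] = {}
    for replica_id, digest in responses:
        by_replica.setdefault(replica_id, digest)

    if not by_replica:
        return "retry"

    # Size of the largest group of replicas that agree with one another.
    _, votes = Counter(by_replica.values()).most_common(1)[0]

    if votes >= n:
        # All replicas agree: the client can rely on the response.
        return "complete"
    if votes >= 2 * f + 1:
        # Enough agreement to assemble a commit certificate and help
        # correct replicas converge on this order.
        return "commit"
    # Too little agreement: retransmit the request.
    return "retry"
```

With f = 1, four matching responses let the client accept the result at once, three matching responses trigger the extra commit step, and a two-two split is treated as an inconsistency that the client must resolve by retrying.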
