Prime: Byzantine Replication under Attack

Existing Byzantine-resilient replication protocols satisfy two standard correctness criteria, safety and liveness, even in the presence of Byzantine faults. The runtime performance of these protocols is most commonly assessed in the absence of processor faults and is usually good in that case. However, faulty processors can significantly degrade the performance of some protocols, limiting their practical utility in adversarial environments. This paper demonstrates the extent of performance degradation possible in some existing protocols that do satisfy liveness and that do perform well absent Byzantine faults. We propose a new performance-oriented correctness criterion that requires a consistent level of performance, even with Byzantine faults. We present a new Byzantine fault-tolerant replication protocol that meets the new correctness criterion and evaluate its performance in fault-free executions and when under attack.

[1]  John Lane,et al.  Byzantine replication under attack , 2008, 2008 IEEE International Conference on Dependable Systems and Networks With FTCS and DCC (DSN).

[2]  Michael K. Reiter,et al.  Fault-scalable Byzantine fault-tolerant services , 2005, SOSP '05.

[3]  Michael O. Rabin,et al.  Randomized byzantine generals , 1983, 24th Annual Symposium on Foundations of Computer Science (sfcs 1983).

[4]  Maurice Herlihy,et al.  Linearizability: a correctness condition for concurrent objects , 1990, TOPL.

[5]  Michael K. Reiter,et al.  Secure and scalable replication in Phalanx , 1998, Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281).

[6]  Miguel Correia,et al.  Spin One's Wheels? Byzantine Fault Tolerance with a Spinning Primary , 2009, 2009 28th IEEE International Symposium on Reliable Distributed Systems.

[7]  Miguel Oom Temudo de Castro,et al.  Practical Byzantine fault tolerance , 1999, OSDI '99.

[8]  Miguel Castro,et al.  Practical byzantine fault tolerance and proactive recovery , 2002, TOCS.

[9]  Jean-Philippe Martin,et al.  Fast Byzantine Consensus , 2006, IEEE Transactions on Dependable and Secure Computing.

[10]  Miguel Correia,et al.  Randomized Intrusion-Tolerant Asynchronous Services , 2006, International Conference on Dependable Systems and Networks (DSN'06).

[11]  Adi Shamir,et al.  A method for obtaining digital signatures and public-key cryptosystems , 1978, CACM.

[12]  Liuba Shrira,et al.  HQ replication: a hybrid quorum protocol for byzantine fault tolerance , 2006, OSDI '06.

[13]  Atul Singh,et al.  BFT Protocols Under Fire , 2008, NSDI.

[14]  Christian Cachin,et al.  Secure INtrusion-Tolerant Replication on the Internet , 2002, Proceedings International Conference on Dependable Systems and Networks.

[15]  Zheng Wang,et al.  An Architecture for Differentiated Services , 1998, RFC.

[16]  Miguel Correia,et al.  How to tolerate half less one Byzantine nodes in practical distributed systems , 2004, Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems, 2004..

[17]  Fred B. Schneider,et al.  Implementing fault-tolerant services using the state machine approach: a tutorial , 1990, CSUR.

[18]  John Lane,et al.  Steward: Scaling Byzantine Fault-Tolerant Replication to Wide Area Networks , 2010, IEEE Transactions on Dependable and Secure Computing.

[19]  John Lane,et al.  Customizable Fault Tolerance forWide-Area Replication , 2007, 2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007).

[20]  Flaviu Cristian,et al.  Understanding fault-tolerant distributed systems , 1991, CACM.

[21]  Michael Dahlin,et al.  Making Byzantine Fault Tolerant Systems Tolerate Byzantine Faults , 2009, NSDI.

[22]  Jonathan Kirsch,et al.  Scaling Byzantine Fault-Tolerant Replication toWide Area Networks , 2006, International Conference on Dependable Systems and Networks (DSN'06).

[23]  David L. Black,et al.  An Architecture for Differentiated Service , 1998 .

[24]  Marek Karpinski,et al.  An XOR-based erasure-resilient coding scheme , 1995 .

[25]  Nancy A. Lynch,et al.  Consensus in the presence of partial synchrony , 1988, JACM.

[26]  Josef Widder,et al.  Implementing Reliable Distributed Real-Time Systems with the Theta-Model , 2005, OPODIS.

[27]  Ramakrishna Kotla,et al.  Zyzzyva , 2007, SOSP.

[28]  Ralph C. Merkle,et al.  Secrecy, authentication, and public key systems , 1979 .

[29]  Michael K. Reiter,et al.  The Rampart Toolkit for Building High-Integrity Services , 1994, Dagstuhl Seminar on Distributed Systems.

[30]  Roy Friedman,et al.  Practical Byzantine Group Communication , 2006, 26th IEEE International Conference on Distributed Computing Systems (ICDCS'06).

[31]  Paulo Veríssimo,et al.  Intrusion-tolerant middleware: the road to automatic security , 2006, IEEE Security & Privacy.

[32]  David Mazières,et al.  Beyond One-Third Faulty Replicas in Byzantine Fault Tolerant Systems , 2007, NSDI.

[33]  Matthias Fitzi,et al.  Optimally efficient multi-valued byzantine agreement , 2006, PODC '06.

[34]  Stefano Tessaro,et al.  Asynchronous verifiable information dispersal , 2005, 24th IEEE Symposium on Reliable Distributed Systems (SRDS'05).

[35]  Michael Ben-Or,et al.  Another advantage of free choice (Extended Abstract): Completely asynchronous agreement protocols , 1983, PODC '83.

[36]  Jorge C. A. de Figueiredo,et al.  How bad are wrong suspicions? towards adaptive distributed protocols , 2003, 2003 International Conference on Dependable Systems and Networks, 2003. Proceedings..

[37]  Leslie Lamport,et al.  Time, clocks, and the ordering of events in a distributed system , 1978, CACM.

[38]  O. Antoine,et al.  Theory of Error-correcting Codes , 2022 .

[39]  Josef Widder,et al.  Implementing reliable distributed real-time systems with the Θ-model , 2006 .

[40]  Michael Dahlin,et al.  BAR fault tolerance for cooperative services , 2005, SOSP '05.

[41]  Arun Venkataramani,et al.  Separating agreement from execution for byzantine fault tolerant services , 2003, SOSP '03.

[42]  Neeraj Suri,et al.  The Fail-Heterogeneous Architectural Model , 2007, 2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007).

[43]  Nancy A. Lynch,et al.  Impossibility of distributed consensus with one faulty process , 1983, PODS '83.

[44]  Gabriel Bracha,et al.  An asynchronous [(n - 1)/3]-resilient consensus protocol , 1984, PODC '84.

[45]  Michael K. Reiter,et al.  Byzantine quorum systems , 1997, STOC '97.