Paper Information - Hybrids on Steroids: SGX-Based High Performance BFT

Hybrids on Steroids: SGX-Based High Performance BFT

With the advent of trusted execution environments provided by recent general-purpose processors, a class of replication protocols has become more attractive than ever: protocols based on a hybrid fault model are able to tolerate arbitrary faults, yet they reduce costs significantly compared to their traditional Byzantine relatives by employing a small subsystem that is trusted to fail only by crashing. Unfortunately, existing proposals come at a price of their own: we are not aware of any hybrid protocol that is backed by a comprehensive formal specification, which complicates reasoning about correctness and implications. Moreover, current protocols of that class have to be performed largely sequentially. Hence, they are poorly prepared for exactly those modern multi-core processors that bring their very own fault model to a broad audience. In this paper, we present Hybster, a new hybrid state-machine replication protocol that is highly parallelizable and formally specified. With over 1 million operations per second using only four cores, the evaluation of our Intel SGX-based prototype implementation shows that Hybster makes hybrid state-machine replication a viable option even for today's very demanding critical services.
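To make the cost reduction mentioned in the abstract concrete, the following minimal sketch compares the replica counts usually cited for the two fault models: 3f + 1 replicas for traditional Byzantine fault-tolerant protocols and 2f + 1 for hybrid protocols with a trusted subsystem. These bounds are standard results from the literature, not figures taken from this page, and the function names are hypothetical.

```python
def replicas_needed(f: int, hybrid: bool) -> int:
    """Minimum number of replicas to tolerate f faulty replicas.

    Assumption (standard bounds, not stated on this page):
    traditional BFT needs 3f + 1, hybrid protocols need 2f + 1.
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    return 2 * f + 1 if hybrid else 3 * f + 1


def quorum_size(f: int, hybrid: bool) -> int:
    """Replicas that must agree before a request is committed.

    Assumption: 2f + 1 for traditional BFT, f + 1 for hybrid protocols.
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    return f + 1 if hybrid else 2 * f + 1


if __name__ == "__main__":
    # Compare both models for a few fault thresholds.
    for f in (1, 2, 3):
        print(
            f"f={f}: traditional n={replicas_needed(f, hybrid=False)}, "
            f"hybrid n={replicas_needed(f, hybrid=True)}"
        )
```

Under these assumptions, tolerating one fault takes three replicas in the hybrid model instead of four, and the gap grows by one replica for each additional tolerated fault.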
