Paper information - CheapBFT: resource-efficient byzantine fault tolerance

One of the main reasons why Byzantine fault-tolerant (BFT) systems are not widely used lies in their high resource consumption: 3f+1 replicas are necessary to tolerate only f faults. Recent works have been able to reduce the minimum number of replicas to 2f+1 by relying on a trusted subsystem that prevents a replica from making conflicting statements to other replicas without being detected. Nevertheless, having been designed with the focus on fault handling, these systems still employ a majority of replicas during normal-case operation for seemingly redundant work. Furthermore, the trusted subsystems available trade off performance for security; that is, they either achieve high throughput or they come with a small trusted computing base. This paper presents CheapBFT, a BFT system that, for the first time, tolerates that all but one of the replicas active in normal-case operation become faulty. CheapBFT runs a composite agreement protocol and exploits passive replication to save resources; in the absence of faults, it requires that only f+1 replicas actively agree on client requests and execute them. In case of suspected faulty behavior, CheapBFT triggers a transition protocol that activates f extra passive replicas and brings all non-faulty replicas into a consistent state again. This approach, for example, allows the system to safely switch to another, more resilient agreement protocol. CheapBFT relies on an FPGA-based trusted subsystem for the authentication of protocol messages that provides high performance and comprises a small trusted computing base.
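The replica counts mentioned in the abstract can be made concrete with a small sketch. The function name `replica_counts` and the dictionary keys are hypothetical, chosen only for illustration. The figures follow the abstract directly: 3f+1 replicas for traditional BFT, 2f+1 with a trusted subsystem, and, for CheapBFT, f+1 active replicas in the normal case plus f passive replicas that are activated on suspected faults. That CheapBFT's total is 2f+1 is inferred here by adding its active and passive replicas; the abstract does not state the sum explicitly.

```python
def replica_counts(f: int) -> dict:
    """Replica counts needed to tolerate f Byzantine faults, as stated in the abstract.

    The key names are illustrative assumptions, not terminology from the paper.
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    return {
        # Traditional BFT systems: 3f+1 replicas.
        "traditional_bft_total": 3 * f + 1,
        # Systems relying on a trusted subsystem: 2f+1 replicas.
        "trusted_subsystem_total": 2 * f + 1,
        # CheapBFT, normal case: only f+1 replicas agree on and execute requests.
        "cheapbft_active": f + 1,
        # CheapBFT: f passive replicas, activated by the transition protocol.
        "cheapbft_passive": f,
    }


if __name__ == "__main__":
    for f in (1, 2, 3):
        print(f, replica_counts(f))
```

For f = 1, this gives 4 replicas for traditional BFT, 3 with a trusted subsystem, and 2 active plus 1 passive replica for CheapBFT in the normal case.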

[1] Fred B. Schneider, et al. Implementing trustworthy services using replicated state machines, 2005, IEEE Security & Privacy Magazine.

[2] Michael Dahlin, et al. Making Byzantine Fault Tolerant Systems Tolerate Byzantine Faults, 2009, NSDI.

[3] Arun Venkataramani, et al. ZZ and the art of practical BFT execution, 2011, EuroSys '11.

[4] Rüdiger Kapitza, et al. Hypervisor-Based Efficient Proactive Recovery, 2007, 2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007).

[5] Heiko Stamer, et al. A Software-Based Trusted Platform Module Emulator, 2008, TRUST.

[6] Peter Y. K. Cheung, et al. Fault tolerance and reliability in field-programmable gate arrays, 2010, IET Computers & Digital Techniques.

[7] Tobias Distler, et al. Increasing performance in byzantine fault-tolerant systems with on-demand replica consistency, 2011, EuroSys '11.

[8] Miguel Correia, et al. Efficient Byzantine Fault-Tolerance, 2013, IEEE Transactions on Computers.

[9] Scott Shenker, et al. Attested append-only memory: making adversaries stick to their word, 2007, SOSP.

[10] Andreas Haeberlen, et al. Practical accountability for distributed systems, 2007.

[11] Jacob R. Lorch, et al. TrInc: Small Trusted Hardware for Large Distributed Systems, 2009, NSDI.

[12] Arun Venkataramani, et al. Separating agreement from execution for byzantine fault tolerant services, 2003, SOSP '03.

[13] Marko Vukolic, et al. The next 700 BFT protocols, 2010, EuroSys '10.

[14] Ariel J. Feldman, et al. SPORC: Group Collaboration using Untrusted Cloud Resources, 2010, OSDI.

[15] Miguel Correia, et al. Spin One's Wheels? Byzantine Fault Tolerance with a Spinning Primary, 2009, 2009 28th IEEE International Symposium on Reliable Distributed Systems.

[16] Miguel Correia, et al. How to tolerate half less one Byzantine nodes in practical distributed systems, 2004, Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems, 2004.

[17] Ahmad-Reza Sadeghi, et al. Reconfigurable trusted computing in hardware, 2007, STC '07.

[18] Tobias Distler, et al. SPARE: Replicas on Hold, 2011, NDSS.

[19] Christian Cachin, et al. Distributing trust on the Internet, 2001, 2001 International Conference on Dependable Systems and Networks.

[20] Idit Keidar, et al. Venus: verification for untrusted cloud storage, 2010, CCSW '10.

[21] Peter Williams, et al. The Blind Stone Tablet: Outsourcing Durability to Untrusted Parties, 2009, NDSS.

[22] Paul England, et al. Para-Virtualized TPM Sharing, 2008, TRUST.

[23] Ramakrishna Kotla, et al. Zyzzyva: speculative byzantine fault tolerance, 2007, TOCS.

[24] Mahadev Konar, et al. ZooKeeper: Wait-free Coordination for Internet-scale Systems, 2010, USENIX ATC.

[25] Petr Kuznetsov, et al. BFTW3: why? when? where? workshop on the theory and practice of byzantine fault tolerance, 2010, SIGA.

[26] Miguel Castro, et al. Practical byzantine fault tolerance and proactive recovery, 2002, TOCS.

[27] Jehan-François Pâris, et al. Voting with Witnesses: A Consistency Scheme for Replicated Files, 1986, ICDCS.

[28] Leslie Lamport, et al. Cheap Paxos, 2004, International Conference on Dependable Systems and Networks, 2004.

[29] Edmund L. Wong, et al. BFT: the time is now, 2008, LADIS '08.

[30] Ramakrishna Kotla, et al. High throughput Byzantine fault tolerance, 2004, International Conference on Dependable Systems and Networks, 2004.

[31] Stefan Berger, et al. vTPM: Virtualizing the Trusted Platform Module, 2006, USENIX Security Symposium.

[32] Udo Steinberg, et al. NOVA: a microhypervisor-based secure virtualization architecture, 2010, EuroSys '10.

[33] Miguel Correia, et al. EBAWA: Efficient Byzantine Agreement for Wide-Area Networks, 2010, 2010 IEEE 12th International Symposium on High Assurance Systems Engineering.

[34] Andreas Haeberlen, et al. PeerReview: practical accountability for distributed systems, 2007, SOSP.