State Machine Replication for the Masses with BFT-SMART

The last fifteen years have seen an impressive amount of work on protocols for Byzantine fault-tolerant (BFT) state machine replication (SMR). However, there is still a need for practical and reliable software libraries implementing this technique. BFT-SMART is an open-source Java-based library implementing robust BFT state machine replication. The key features that distinguish this library from similar work (e.g., PBFT and UpRight) are improved reliability, modularity as a first-class property, multicore-awareness, reconfiguration support, and a flexible programming interface. Compared to other SMR libraries, BFT-SMART achieves better performance and withstands a number of real-world faults that previous implementations cannot.
