Turtle Consensus: Moving Target Defense for Consensus

Consensus is a basic building block in middleware configuration services [4, 18]. While such services are designed to tolerate crash failures in asynchronous settings, they may not stand up well to Denial-of-Service (DoS) attacks. Specifically, malicious clients can carefully craft workloads that substantially degrade the performance of many state-of-the-art consensus protocols. By exploiting protocol-specific vulnerabilities, attackers can constantly force protocol participants onto slow execution paths [8]. In this paper, we investigate how to design consensus protocols that provide acceptable performance under DoS attacks that aim to saturate the bandwidth of protocol participants. We propose a new asynchronous consensus protocol that we call Turtle Consensus. Turtle Consensus employs previously proposed crash-tolerant consensus protocols and exploits their diverse characteristics by switching between protocols from round to round. Some protocols are fast under benign conditions, but their performance suffers greatly under attack. Other protocols may not be as fast under benign conditions, but their performance may actually benefit from naive attacks. By reconfiguring the consensus protocol on the fly, we can achieve the best of both worlds: excellent performance in benign scenarios and acceptable performance under attack, even if the client workload is high. We evaluate Turtle Consensus in adversarial scenarios in which at most one process may fail, and show that it achieves better performance under attack than existing crash-tolerant protocols.
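
The idea of switching protocols from round to round can be illustrated with a minimal sketch. The abstract does not say which protocols are used or how the next one is chosen, so the protocol labels and the round-robin policy below are assumptions made purely for illustration, not the paper's mechanism.

```python
# Hypothetical protocol labels; the abstract does not name the protocols.
PROTOCOLS = ["fast_under_benign_load", "robust_under_attack"]


def protocol_for_round(round_number, protocols=PROTOCOLS):
    """Return the protocol to run in a given round.

    Assumed policy: plain round-robin rotation over the protocol list.
    """
    if round_number < 0:
        raise ValueError("round_number must be non-negative")
    return protocols[round_number % len(protocols)]


def schedule(num_rounds, protocols=PROTOCOLS):
    """List the protocol chosen for each of the first num_rounds rounds."""
    return [protocol_for_round(r, protocols) for r in range(num_rounds)]


if __name__ == "__main__":
    for r, name in enumerate(schedule(4)):
        print(f"round {r}: {name}")
```

In this sketch the choice depends only on the round number, so every process that knows the current round computes the same protocol without exchanging extra messages.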

[1] Miguel Oom Temudo de Castro, et al. Practical Byzantine fault tolerance, 1999, OSDI '99.

[2] Mark Bickford, et al. Nysiad: Practical Protocol Transformation to Tolerate Byzantine Failures, 2008, NSDI.

[3] John K. Ousterhout, et al. In Search of an Understandable Consensus Algorithm, 2014, USENIX ATC.

[4] Michael Dahlin, et al. Making Byzantine Fault Tolerant Systems Tolerate Byzantine Faults, 2009, NSDI.

[5] Atul Singh, et al. BFT Protocols Under Fire, 2008, NSDI.

[6] Nancy A. Lynch, et al. Impossibility of distributed consensus with one faulty process, 1983, PODS '83.

[7] Ramakrishna Kotla, et al. Zyzzyva, 2007, SOSP.

[8] Jialin Li, et al. Designing Distributed Systems Using Approximate Synchrony in Data Center Networks, 2015, NSDI.

[9] Liuba Shrira, et al. HQ replication: a hybrid quorum protocol for byzantine fault tolerance, 2006, OSDI '06.

[10] Nancy A. Lynch, et al. Consensus in the presence of partial synchrony, 1988, JACM.

[11] Roy Friedman, et al. A framework for protocol composition in Horus, 1995, PODC '95.

[12] Michael J. Fischer, et al. The Consensus Problem in Unreliable Distributed Systems (A Brief Survey), 1983, FCT.

[13] Leslie Lamport, et al. Time, clocks, and the ordering of events in a distributed system, 1978, CACM.

[14] Robert Griesemer, et al. Paxos made live: an engineering perspective, 2007, PODC '07.

[15] Mike Hibler, et al. An integrated experimental environment for distributed systems and networks, 2002, OPSR.

[16] Brett D. Fleisch, et al. The Chubby lock service for loosely-coupled distributed systems, 2006, OSDI '06.

[17] Fred B. Schneider, et al. Implementing fault-tolerant services using the state machine approach: a tutorial, 1990, CSUR.

[18] Pekka Nikander, et al. DOS-Resistant Authentication with Client Puzzles, 2000, Security Protocols Workshop.

[19] Marko Vukolic, et al. The Next 700 BFT Protocols, 2015, ACM Trans. Comput. Syst.

[20] Luís E. T. Rodrigues, et al. A Machine Learning Approach to Performance Prediction of Total Order Broadcast Protocols, 2010, 2010 Fourth IEEE International Conference on Self-Adaptive and Self-Organizing Systems.

[21] Mark Bickford, et al. Protocol switching: exploiting meta-properties, 2001, Proceedings 21st International Conference on Distributed Computing Systems Workshops.

[22] Danny Dolev, et al. On the minimal synchronism needed for distributed consensus, 1983, 24th Annual Symposium on Foundations of Computer Science (sfcs 1983).

[23] Mahadev Konar, et al. ZooKeeper: Wait-free Coordination for Internet-scale Systems, 2010, USENIX ATC.

[24] Mark Bickford, et al. Investigating correct-by-construction attack-tolerant systems, 2011.

[25] Michael Ben-Or, et al. Another advantage of free choice (Extended Abstract): Completely asynchronous agreement protocols, 1983, PODC '83.

[26] Leslie Lamport, et al. The part-time parliament, 1998, TOCS.

[27] Sam Toueg, et al. Unreliable failure detectors for reliable distributed systems, 1996, JACM.

[28] Keith Marzullo, et al. Mencius: Building Efficient Replicated State Machine for WANs, 2008, OSDI.

[29] David G. Andersen, et al. There is more consensus in Egalitarian parliaments, 2013, SOSP.

[30] Marcos K. Aguilera, et al. Randomization and Failure Detection: A Hybrid Approach to Solve Consensus, 1996, WDAG.

[31] Robbert van Renesse, et al. Building Adaptive Systems Using Ensemble, 1998, Softw. Pract. Exp.

[32] Li Gong, et al. Implementing Adaptive Fault-Tolerant Services for Hybrid Faults, 2007.