Strong and Efficient Consistency with Consistency-aware Durability

We introduce consistency-aware durability, or CAD, a new approach to durability in distributed storage that enables strong consistency while delivering high performance. We demonstrate the efficacy of this approach by designing cross-client monotonic reads, a novel and strong consistency property that provides monotonic reads across failures and sessions in leader-based systems. We build ORCA, a modified version of ZooKeeper that implements CAD and cross-client monotonic reads. We experimentally show that ORCA provides strong consistency while closely matching the performance of weakly consistent ZooKeeper. Compared to strongly consistent ZooKeeper, ORCA provides significantly higher throughput (1.8–3.3×) and notably reduces latency, sometimes by an order of magnitude in geo-distributed settings.
