Highly Available Transactions: Virtues and Limitations

To minimize network latency and remain online during server failures and network partitions, many modern distributed data storage systems eschew transactional functionality, which provides strong semantic guarantees for groups of multiple operations over multiple data items. In this work, we consider the problem of providing Highly Available Transactions (HATs): transactional guarantees that do not suffer unavailability during system partitions or incur high network latency. We introduce a taxonomy of highly available systems and analyze existing ACID isolation and distributed data consistency guarantees to identify which can and cannot be achieved in HAT systems. This unifies the literature on weak transactional isolation, replica consistency, and highly available systems. We analytically and experimentally quantify the availability and performance benefits of HATs (often two to three orders of magnitude over wide-area networks) and discuss their necessary semantic compromises.
