Parameterized and Runtime-Tunable Snapshot Isolation in Distributed Transactional Key-Value Stores

Several relaxed variants of Snapshot Isolation (SI) have been proposed to improve performance in distributed transactional key-value stores. These variants, however, neither specify nor control the severity of the anomalies they admit with respect to SI. They are also designed to be used statically throughout the whole system life cycle. To overcome these drawbacks, we propose parameterized and runtime-tunable snapshot isolation. We first define a new transactional consistency model called Relaxed Version Snapshot Isolation (RVSI), which formally and quantitatively specifies the anomalies it may produce with respect to SI. To this end, we decompose SI into three "view properties" and, for each of them, introduce a parameter that quantifies one of three kinds of possible anomalies: k1-BV (k1-version bounded backward view), k2-FV (k2-version bounded forward view), and k3-SV (k3-version bounded snapshot view). We then implement Chameleon, a prototype partitioned and replicated distributed transactional key-value store deployed across multiple data centers. While achieving RVSI, Chameleon allows each transaction to tune its consistency level dynamically at runtime. Our experiments show that RVSI reduces transaction abort rates when applications are willing to tolerate certain anomalies. We also evaluate the individual impacts of k1-BV, k2-FV, and k3-SV on reducing transaction abort rates in various scenarios. We find that whether k1 or k2 plays the major role in reducing transaction abort rates depends on the issue delays between clients and replicas.
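As a rough illustration of how three version bounds could constrain a transaction's reads, the following Python sketch checks a set of reads against k1, k2, and k3. The abstract does not give the formal definitions, so everything here is an assumption made for illustration: versions are identified by integer commit timestamps, the backward bound counts versions of a key committed after the version read and no later than the transaction start, the forward bound counts versions committed after the start up to and including the version read, and the snapshot bound counts versions of one key that fall between two versions read by the same transaction. The names `Read`, `versions_between`, and `satisfies_rvsi` are hypothetical and are not taken from Chameleon.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    key: str
    commit_ts: int  # commit timestamp of the version that was read (assumed model)


def versions_between(history, key, lo, hi):
    """Count versions of `key` with commit timestamp in the interval (lo, hi]."""
    ts = history[key]  # sorted list of commit timestamps for this key
    return bisect_right(ts, hi) - bisect_right(ts, lo)


def satisfies_rvsi(history, start_ts, reads, k1, k2, k3):
    """Check reads against assumed readings of k1-BV, k2-FV, and k3-SV."""
    for r in reads:
        if r.commit_ts <= start_ts:
            # Backward view: how many newer versions existed at start time.
            if versions_between(history, r.key, r.commit_ts, start_ts) > k1:
                return False
        else:
            # Forward view: how far past the start time the read version lies.
            if versions_between(history, r.key, start_ts, r.commit_ts) > k2:
                return False
    for a in reads:
        for b in reads:
            if a.key != b.key and a.commit_ts < b.commit_ts:
                # Snapshot view: versions of a.key skipped relative to b's version.
                if versions_between(history, a.key, a.commit_ts, b.commit_ts) > k3:
                    return False
    return True


# Hypothetical history: key -> sorted commit timestamps of its versions.
history = {"x": [1, 4, 6], "y": [2, 5]}
stale = [Read("x", 1), Read("y", 5)]    # x is one version behind at start_ts = 5
ahead = [Read("x", 6), Read("y", 5)]    # x was committed after start_ts = 5

print(satisfies_rvsi(history, 5, stale, k1=0, k2=0, k3=0))  # strict setting
print(satisfies_rvsi(history, 5, stale, k1=1, k2=0, k3=1))  # relaxed backward/snapshot
print(satisfies_rvsi(history, 5, ahead, k1=0, k2=1, k3=0))  # relaxed forward
```

Under these assumptions, setting all three parameters to zero only accepts reads taken from a single snapshot at the start time, while larger values admit progressively staler, newer, or less aligned reads. A per-transaction choice of (k1, k2, k3) is one way to picture the runtime tuning described above.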
