Efficient Linearizable Write Operations Using Bounded Global Time Uncertainty

Distributed key-value stores employed in data centers treat each key-value pair as a shared memory register. For fault tolerance and performance, each key-value pair is replicated. Various models exist for the consistency of data among the replicas. Although atomic consistency, also known as linearizability, is the strongest form of consistency for read and write operations, several key-value stores, such as Cassandra and Dynamo, offer only eventual consistency. One main motivation for this choice is the performance cost of guaranteeing atomic consistency. In this paper, we use time with known, bounded uncertainty to improve the performance of write operations while maintaining atomic consistency. We show how to use the concept of commit wait in a shared memory register to perform a write operation in one phase (one message round trip) instead of two. We evaluate the solution experimentally by comparing it to ABD, a well-known algorithm for achieving atomic consistency in an asynchronous network, which uses two phases for write operations. We also compare our protocol to an eventually consistent register. Our experiments show improved throughput and lower write latency compared to ABD.
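The abstract does not spell out the protocol, so the following is only a minimal sketch of the general commit-wait idea, not the authors' algorithm. It assumes a clock whose error is bounded by a known `epsilon`, a majority quorum, and in-process replicas standing in for one network round trip. All names (`BoundedClock`, `Replica`, `write`) are hypothetical and chosen for illustration.

```python
import time
from dataclasses import dataclass


@dataclass
class Interval:
    earliest: float  # true time is assumed to be no earlier than this
    latest: float    # true time is assumed to be no later than this


class BoundedClock:
    """Hypothetical clock whose error is assumed to be at most `epsilon` seconds."""

    def __init__(self, epsilon, source=time.monotonic):
        self.epsilon = epsilon
        self.source = source

    def now(self):
        t = self.source()
        return Interval(t - self.epsilon, t + self.epsilon)


class Replica:
    """Stores the value carrying the highest timestamp seen so far."""

    def __init__(self):
        self.ts = float("-inf")
        self.value = None

    def store(self, ts, value):
        if ts > self.ts:
            self.ts, self.value = ts, value
        return True  # acknowledgement


def write(replicas, clock, value, sleep=time.sleep):
    # Choose the timestamp locally instead of querying replicas first;
    # this is what removes the extra phase in this sketch.
    ts = clock.now().latest

    # Single phase: send (ts, value) to the replicas and count acknowledgements.
    acks = sum(1 for r in replicas if r.store(ts, value))
    if acks <= len(replicas) // 2:
        raise RuntimeError("no majority acknowledged the write")

    # Commit wait: do not return until ts is guaranteed to be in the past
    # on every clock that respects the same uncertainty bound.
    while True:
        remaining = ts - clock.now().earliest
        if remaining < 0:
            break
        sleep(max(remaining, 1e-4))
    return ts
```

In this sketch the writer pays for the saved round trip with a wait of roughly twice `epsilon`, so the trade only helps when the uncertainty bound is small relative to a network round trip.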
