An experimental study on tuning the consistency of NoSQL systems

The eventual consistency model has been widely adopted in NoSQL systems. By tolerating weak consistency, these systems attain high throughput and availability at the cost of side effects on user experience and developer friendliness. Trading consistency for latency has become widely accepted. An important but largely ignored problem is how to control the consistency of an existing system without modifying its implementation. In this paper, we present a systematic study of the client-centric consistency of a NoSQL system, Cassandra, and show how consistency can be substantially enhanced by tuning system configurations when users adopt partial quorum settings. We use session guarantees as the consistency model and analyze the root cause of consistency violations, showing that the length of the write queue is a reasonable indicator for quantifying consistency. For inconsistency mitigation, we show through extensive experiments how consistency is affected by the read and write processes of the system and how it can be improved by tuning system configurations. In particular, we recommend configurations to developers that adjust the number of write threads and the fine-grained quorum setting for enhanced consistency control. Because consistency anomalies do not occur uniformly, we also discuss how to stabilize consistency by analyzing system logs.
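
To make the terms "partial quorum" and "session guarantee" concrete, here is a minimal sketch. It assumes the standard quorum-intersection condition (with N replicas, a read quorum R and a write quorum W always overlap only when R + W > N) and treats read-your-writes as one example of a session guarantee. All function names and the trace format are hypothetical illustrations; they are not taken from the paper or from Cassandra's API.

```python
def is_strict_quorum(n, r, w):
    """Return True when every read quorum intersects every write quorum."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must each be between 1 and n")
    return r + w > n


def classify(n, r, w):
    """Label a quorum setting as 'strict' or 'partial' (illustrative names)."""
    return "strict" if is_strict_quorum(n, r, w) else "partial"


def read_your_writes_violations(trace):
    """Count reads in one client session that return a version older than
    the newest version this session has already written to the same key.

    trace: list of (op, key, version) tuples, with op being "w" or "r"
    and version an increasing integer (an assumed, simplified format).
    """
    last_written = {}
    violations = 0
    for op, key, version in trace:
        if op == "w":
            # Remember the newest version this session wrote for the key.
            last_written[key] = max(version, last_written.get(key, version))
        elif op == "r":
            # A stale read breaks the read-your-writes guarantee.
            if key in last_written and version < last_written[key]:
                violations += 1
    return violations
```

Under these assumptions, a setting such as N = 3, R = 1, W = 1 is a partial quorum, so a read may miss the latest write; counting stale reads per session is one simple way to quantify how often that happens.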
