Deciding When to Trade Data Freshness for Performance in MongoDB-as-a-Service

MongoDB is a popular document store that is also available as a cloud-hosted service. Internally, MongoDB uses primary-copy asynchronous replication, and it lets clients vary the Read Preference so that reads can deliberately be directed to secondaries rather than to the primary. Doing so can sometimes improve performance, but the returned data might be stale, whereas the primary always returns the freshest value. The current state of practice is for programmers to decide where to direct reads at application development time, when they do not yet fully understand the workload or the hardware capacity. It should be better to choose the appropriate Read Preference setting at runtime, as we describe in this paper. We show how a system can detect when the primary copy is saturated in MongoDB-as-a-Service, and how it can use this to choose where reads should be done to improve overall performance. Our approach is aimed at a cloud consumer; it assumes access to only the limited diagnostic data provided to clients of the hosted service.
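
The runtime choice described above can be illustrated with a minimal sketch. It is not the detection method of the paper, which this abstract does not specify. It assumes, purely for illustration, that the client can observe recent read latencies against the primary and treats a high median latency as a sign of saturation. The class name `ReadPreferenceChooser`, the window size, and the 50 ms threshold are hypothetical; the two returned strings are MongoDB's standard Read Preference mode names.

```python
from collections import deque
from statistics import median

# Standard MongoDB Read Preference mode names.
PRIMARY = "primary"
SECONDARY_PREFERRED = "secondaryPreferred"


class ReadPreferenceChooser:
    """Hypothetical helper: picks a read preference from recent latencies.

    Assumption: a sustained rise in client-observed latency on the primary
    is used as a stand-in for "the primary is saturated".
    """

    def __init__(self, window=20, saturation_ms=50.0):
        # Keep only the most recent `window` latency samples.
        self.samples = deque(maxlen=window)
        self.saturation_ms = saturation_ms

    def record(self, latency_ms):
        """Store one observed primary read latency, in milliseconds."""
        self.samples.append(latency_ms)

    def choose(self):
        """Return the read preference mode to use for the next reads."""
        if not self.samples:
            # No evidence yet: default to the freshest data.
            return PRIMARY
        if median(self.samples) > self.saturation_ms:
            # Primary looks saturated: accept possibly stale reads.
            return SECONDARY_PREFERRED
        return PRIMARY


chooser = ReadPreferenceChooser(window=5, saturation_ms=50.0)
for latency in [10, 12, 11]:
    chooser.record(latency)
print(chooser.choose())
```

In an application, the returned mode would be passed to the driver when issuing reads. Using the median over a sliding window, rather than a single sample, is one way to avoid switching on a brief latency spike.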
