MonkeyDB: effectively testing correctness under weak isolation levels

Modern applications, such as social networking systems and e-commerce platforms, are built around large-scale storage systems for storing and retrieving data. In the presence of concurrent accesses, these storage systems trade off isolation for performance. The weaker the isolation level, the more behaviors a storage system is allowed to exhibit, and it is up to the developer to ensure that their application can tolerate those behaviors. However, these weak behaviors occur only rarely in practice and are outside the control of the application, making it difficult for developers to test the robustness of their code against weak isolation levels. This paper presents MonkeyDB, a mock storage system for testing storage-backed applications. MonkeyDB supports a key-value interface as well as SQL queries under multiple isolation levels. On each read operation, it uses a logical specification of the isolation level to compute the set of all possible return values, and then returns a value chosen at random from this set. We show that MonkeyDB provides good coverage of weak behaviors, which is complete in the limit. We test a variety of applications for assertions that fail only under weak isolation. MonkeyDB is able to break each of those assertions in a small number of attempts.
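The read strategy described above (compute the set of values a read may legally return, then pick one at random) can be illustrated with a small sketch. This is not MonkeyDB's implementation or API: the class `MockStore`, its methods, and the consistency rule it enforces are all hypothetical. In place of a real isolation-level specification, the sketch assumes a much simpler per-key rule, under which a session must see its own latest write and must never read a value older than one it has already read.

```python
import random


class MockStore:
    """Toy key-value store (hypothetical, not MonkeyDB's API).

    A read returns any value permitted by a simplified rule:
    read-your-writes plus monotonic reads, tracked per session and key.
    """

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._writes = {}  # key -> list of written values, oldest first
        self._floor = {}   # (session, key) -> oldest index still readable

    def write(self, session, key, value):
        history = self._writes.setdefault(key, [])
        history.append(value)
        # The writing session must observe its own latest write from now on.
        self._floor[(session, key)] = len(history) - 1

    def candidates(self, session, key):
        """Return every value a read by `session` may legally return."""
        history = self._writes.get(key, [])
        if not history:
            return [None]
        start = self._floor.get((session, key), 0)
        return history[start:]

    def read(self, session, key):
        history = self._writes.get(key, [])
        if not history:
            return None
        start = self._floor.get((session, key), 0)
        # Pick one allowed write at random instead of always the latest.
        index = self._rng.randrange(start, len(history))
        # Monotonic reads: later reads may not go back before this one.
        self._floor[(session, key)] = index
        return history[index]


store = MockStore(seed=0)
store.write("s1", "x", 1)
store.write("s2", "x", 2)
print(store.candidates("s1", "x"))  # s1 may see its own write or the newer one
print(store.candidates("s2", "x"))  # s2 must see its own, latest write
```

Because a stale value is returned with noticeable probability on every read, an application assertion that silently relies on always reading the latest write tends to fail after a handful of runs in a store like this, rather than only under rare timing in a production system.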
