Version Reconciliation for Collaborative Databases

We propose MindPalace, a prototype versioned database for efficient collaborative data management. MindPalace supports offline collaboration, in which users work independently without real-time communication. At its core is a critical step of offline collaboration: reconciling divergent branches produced by concurrent data manipulation. We formalize auto-mergeability, a condition under which branches can be reconciled without human intervention, and propose an efficient framework that determines whether two branches are auto-mergeable and identifies the particular records that require manual reconciliation.
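
To make the idea of reconciling two branches concrete, the sketch below uses a plain record-level three-way comparison against a common ancestor. This is a simplified stand-in, not MindPalace's formal definition of auto-mergeability, which is not stated here. The names `find_conflicts` and `try_merge`, and the representation of a branch as a dictionary from record key to record, are assumptions made for illustration only.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical representation: a branch is a mapping from record key to record.
Table = Dict[str, dict]


def find_conflicts(base: Table, left: Table, right: Table) -> Set[str]:
    """Return keys that both branches changed relative to base, in different ways."""
    conflicts = set()
    for key in set(base) | set(left) | set(right):
        b, l, r = base.get(key), left.get(key), right.get(key)
        # A missing record is None, so inserts and deletes are compared too.
        if l != b and r != b and l != r:
            conflicts.add(key)
    return conflicts


def try_merge(base: Table, left: Table, right: Table) -> Tuple[Optional[Table], Set[str]]:
    """Merge automatically when no record conflicts; otherwise report the conflicts."""
    conflicts = find_conflicts(base, left, right)
    if conflicts:
        return None, conflicts  # these records need manual reconciliation
    merged: Table = {}
    for key in set(base) | set(left) | set(right):
        b, l, r = base.get(key), left.get(key), right.get(key)
        chosen = l if l != b else r  # take whichever side changed the record
        if chosen is not None:       # None means the record was deleted
            merged[key] = chosen
    return merged, set()


base = {"a": {"v": 1}, "b": {"v": 2}}
left = {"a": {"v": 10}, "b": {"v": 2}}                  # updates record "a"
right = {"a": {"v": 1}, "b": {"v": 2}, "c": {"v": 3}}   # inserts record "c"
merged, conflicts = try_merge(base, left, right)

clashing = {"a": {"v": 99}, "b": {"v": 2}}              # also updates "a", differently
no_merge, clash_keys = try_merge(base, left, clashing)
```

In this toy setting, `left` and `right` touch disjoint records and merge without intervention, while `left` and `clashing` both rewrite record `a`, so the merge is refused and `a` is reported for manual reconciliation. A condition defined over the operations that produced each branch, rather than over final record values, could admit more merges than this record-level check does.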
