Multi-schema-version data management: data independence in the twenty-first century

Agile software development allows us to continuously evolve and run a software system. However, this is not possible in databases, as established methods are very expensive, error-prone, and far from agile. We present InVerDa, a multi-schema-version database management system (MSVDB) for agile database development. MSVDBs realize co-existing schema versions within one database, where each schema version behaves like a regular single-schema database and write operations are propagated between schema versions. Developers use a relationally complete and bidirectional database evolution language (BiDEL) to easily evolve existing schema versions to new ones. BiDEL scripts are more robust, orders of magnitude shorter, and cause only a small performance overhead compared to handwritten SQL scripts. We formally guarantee data independence: no matter how the data of the co-existing schema versions is physically materialized, each schema version is guaranteed to behave like a regular database. Since the chosen physical materialization largely determines the overall performance, we equip database administrators with an advisor that proposes an optimized materialization for the current workload, which can improve the performance by orders of magnitude compared to naïve solutions. To the best of our knowledge, we are the first to facilitate agile evolution of production databases with full support of co-existing schema versions and formally guaranteed data independence.
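
The idea of co-existing schema versions with propagated writes can be illustrated with a minimal sketch. It is not InVerDa's implementation and does not use BiDEL: it assumes plain SQLite views with INSTEAD OF triggers, and every table, view, and column name (`task_data`, `v1_task`, `v2_todo`, `prio`) is hypothetical. One physical table holds the data; each schema version is a view that accepts writes and forwards them to that table, so a write through either version is visible through the other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical physical materialization: one table holds all data.
CREATE TABLE task_data (
    id     INTEGER PRIMARY KEY,
    author TEXT,
    task   TEXT,
    prio   INTEGER
);

-- Schema version 1 (assumed): exposes every column.
CREATE VIEW v1_task AS
    SELECT id, author, task, prio FROM task_data;
CREATE TRIGGER v1_task_insert INSTEAD OF INSERT ON v1_task
BEGIN
    INSERT INTO task_data (author, task, prio)
    VALUES (NEW.author, NEW.task, NEW.prio);
END;

-- Schema version 2 (assumed): only the most urgent tasks, no prio column.
CREATE VIEW v2_todo AS
    SELECT id, author, task FROM task_data WHERE prio = 1;
CREATE TRIGGER v2_todo_insert INSTEAD OF INSERT ON v2_todo
BEGIN
    -- A write through version 2 is stored with the implied priority 1.
    INSERT INTO task_data (author, task, prio)
    VALUES (NEW.author, NEW.task, 1);
END;
""")

# Each version is written to as if it were a regular table.
conn.execute(
    "INSERT INTO v1_task (author, task, prio) VALUES ('Ann', 'write paper', 2)"
)
conn.execute("INSERT INTO v2_todo (author, task) VALUES ('Ben', 'fix bug')")

v1_rows = conn.execute(
    "SELECT author, task, prio FROM v1_task ORDER BY id"
).fetchall()
v2_rows = conn.execute(
    "SELECT author, task FROM v2_todo ORDER BY id"
).fetchall()

print(v1_rows)  # both tasks, including the one written through version 2
print(v2_rows)  # only the task with priority 1
```

In this sketch the views and triggers are written by hand for a single fixed materialization. The abstract's claim is that an MSVDB derives such mappings from the evolution script and keeps every schema version behaving like a regular database regardless of which materialization is chosen.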
