Merging Relational Views: A Minimization Approach

Schema integration combines several interrelated schemas into a unified schema, called the mediated schema. It comes in two major flavors: data integration and view integration. The former integrates multiple data sources to create a mediated query interface, while the latter constructs a base schema capable of supporting the source schemas as views. Our work builds upon previous approaches that address relational view integration using logical mapping constraints. Given a set of data dependencies over the source schemas as input, our approach produces a minimal, information-preserving mediated schema with constraints, and it generates output mappings that define the source schemas as views. We extend previous approaches in several respects. First, schema minimization is performed within the scope of Project-Join views that are information-preserving, which yields a smaller mediated schema than existing work. Second, the input schema mapping language is expressive enough to capture not only query containment but also query equivalence. Third, source integrity constraints can be seamlessly incorporated into the reasoning. Finally, we have evaluated our implementation on both real-world data sets and a schema mapping benchmark.
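As a rough illustration of what "supporting the source schemas as views" and "information-preserving" mean, the following toy sketch merges two source relations into one mediated relation and recovers each source as a projection view. It is not the algorithm of this work: the relations `employee_name` and `employee_dept`, their attributes, and the helper functions are all hypothetical, and the only mapping constraint assumed is an equivalence between the `id` projections of the two sources.

```python
# Toy illustration only: relation names, attributes, and helpers are
# hypothetical and do not come from the paper.

def rel(*rows):
    """Build a relation (set semantics) from dict rows."""
    return {frozenset(r.items()) for r in rows}

def project(relation, attrs):
    """Relational projection onto the given attribute names."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in relation}

def natural_join(left, right):
    """Natural join: combine rows that agree on all shared attributes."""
    out = set()
    for l in left:
        ld = dict(l)
        for r in right:
            rd = dict(r)
            shared = ld.keys() & rd.keys()
            if all(ld[a] == rd[a] for a in shared):
                out.add(frozenset({**ld, **rd}.items()))
    return out

# Two source schemas (hypothetical instances).
employee_name = rel({"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"})
employee_dept = rel({"id": 1, "dept": "R&D"}, {"id": 2, "dept": "Ops"})

# Assumed input constraint: the id projections of both sources are equivalent.
assert project(employee_name, {"id"}) == project(employee_dept, {"id"})

# Mediated schema: one relation (id, name, dept) instead of two.
mediated = natural_join(employee_name, employee_dept)

# Output mappings: each source is defined as a projection view of the mediated relation.
view_name = project(mediated, {"id", "name"})
view_dept = project(mediated, {"id", "dept"})

# Information preservation on this instance: the views reproduce the sources exactly.
print(view_name == employee_name, view_dept == employee_dept)
```

Under the assumed equivalence constraint, no source tuple is lost in the join, so the single three-attribute relation can stand in for the two two-attribute sources. Without that constraint, tuples whose `id` appears in only one source would be dropped, and the merge would no longer be information-preserving.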
