Reverse data exchange: Coping with nulls

An inverse of a schema mapping M is intended to "undo" what M does, thus providing a way to perform "reverse" data exchange. In recent years, three different formalizations of this concept have been introduced and studied, namely, the notions of an inverse of a schema mapping, a quasi-inverse of a schema mapping, and a maximum recovery of a schema mapping. The study of these notions has been carried out in the context in which source instances are restricted to consist entirely of constants, while target instances may contain both constants and labeled nulls. This restriction on source instances is crucial for obtaining some of the main technical results about these three notions, but, at the same time, limits their usefulness, since reverse data exchange naturally leads to source instances that may contain both constants and labeled nulls. We develop a new framework for reverse data exchange that supports source instances that may contain nulls, thus overcoming the semantic mismatch between source and target instances of the previous formalizations. The development of this new framework requires a careful reformulation of all the important notions, including the notions of the identity schema mapping, inverse, and maximum recovery. To this effect, we introduce the notions of extended identity schema mapping, extended inverse, and maximum extended recovery, by making systematic use of the homomorphism relation on instances. We give results concerning the existence of extended inverses and of maximum extended recoveries, and results concerning their applications to reverse data exchange and query answering. Moreover, we show that maximum extended recoveries can be used to capture in a quantitative way the amount of information loss embodied in a schema mapping specified by source-to-target tuple-generating dependencies.
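
The homomorphism relation on instances, on which the extended notions above rely, can be illustrated with a small sketch. This is not the paper's construction: the representation of an instance as a set of `(relation, tuple)` facts, the `Null` class, and the function name `find_homomorphism` are all assumptions made for illustration. It uses the standard definition, under which a homomorphism maps each constant to itself, may map a labeled null to any value, and must send every fact of the first instance to a fact of the second.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Null:
    """A labeled null (hypothetical representation); any other value is a constant."""
    label: str


def find_homomorphism(source, target):
    """Search for a homomorphism from `source` to `target`.

    Instances are sets of (relation_name, tuple_of_values) facts.
    Returns a dict from the nulls of `source` to values of `target`,
    or None if no homomorphism exists. Constants must map to themselves.
    """
    facts = sorted(source, key=repr)  # fixed order for the backtracking search
    target_by_rel = {}
    for rel, tup in target:
        target_by_rel.setdefault(rel, []).append(tup)

    def extend(i, h):
        if i == len(facts):
            return h
        rel, tup = facts[i]
        for cand in target_by_rel.get(rel, []):
            if len(cand) != len(tup):
                continue
            h2 = dict(h)
            ok = True
            for a, b in zip(tup, cand):
                if isinstance(a, Null):
                    # A null may map to anything, but consistently.
                    if h2.setdefault(a, b) != b:
                        ok = False
                        break
                elif a != b:
                    # A constant must be preserved.
                    ok = False
                    break
            if ok:
                result = extend(i + 1, h2)
                if result is not None:
                    return result
        return None

    return extend(0, {})


# Example instances: one with a labeled null, one with constants only.
n1 = Null("n1")
with_null = {("P", ("a", n1))}
ground = {("P", ("a", "b"))}

forward = find_homomorphism(with_null, ground)   # the null can be sent to "b"
backward = find_homomorphism(ground, with_null)  # the constant "b" cannot become a null
```

In this example a homomorphism exists from the instance with a null to the ground instance but not in the other direction. The search is a plain backtracking procedure and is exponential in the worst case, which is acceptable for toy instances only.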
