TRAMP: Understanding the Behavior of Schema Mappings through Provenance

Though partially automated, developing schema mappings remains a complex and potentially error-prone task. In this paper, we present TRAMP (TRAnsformation Mapping Provenance), an extensive suite of tools supporting the debugging and tracing of schema mappings and transformation queries. TRAMP combines and extends data provenance with two novel notions, transformation provenance and mapping provenance, to explain the relationship between transformed data and the transformations and mappings that produced it. In addition, we provide query support for transformations, data, and all forms of provenance. We formally define transformation and mapping provenance, present an efficient implementation of both forms of provenance, and evaluate the resulting system through extensive experiments.
