Orchid: Integrating Schema Mapping and ETL

This paper describes Orchid, a system that converts declarative mapping specifications into data flow specifications (ETL jobs) and vice versa. Orchid provides an abstract operator model that serves as a common model for both transformation paradigms; both mappings and ETL jobs are transformed into instances of this common model. As an additional benefit, instances of this common model can be optimized and deployed into multiple target environments. Orchid is being deployed in FastTrack, a data transformation toolkit in IBM Information Server.
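The idea of a common operator model can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not Orchid's actual design: the `Operator` class, the four operator kinds, the dictionary formats for mappings and ETL stages, and all function names are hypothetical. The sketch only shows how a declarative mapping and an ETL job could both be translated into the same intermediate representation, and how that representation could be translated back.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """One node of a hypothetical common model (a linear pipeline here)."""
    kind: str          # "SOURCE", "FILTER", "PROJECT", or "TARGET"
    args: tuple = ()


def mapping_to_operators(mapping: dict) -> list[Operator]:
    """Translate a declarative mapping (hypothetical dict format) into operators."""
    ops = [Operator("SOURCE", (mapping["source"],))]
    if mapping.get("where"):
        ops.append(Operator("FILTER", (mapping["where"],)))
    # Column correspondences: target column -> source expression.
    ops.append(Operator("PROJECT", tuple(sorted(mapping["columns"].items()))))
    ops.append(Operator("TARGET", (mapping["target"],)))
    return ops


def etl_job_to_operators(stages: list[dict]) -> list[Operator]:
    """Translate an ETL job (hypothetical list of stage dicts) into operators."""
    ops = []
    for stage in stages:
        name = stage["stage"]
        if name == "read":
            ops.append(Operator("SOURCE", (stage["table"],)))
        elif name == "filter":
            ops.append(Operator("FILTER", (stage["predicate"],)))
        elif name == "transform":
            ops.append(Operator("PROJECT", tuple(sorted(stage["derivations"].items()))))
        elif name == "write":
            ops.append(Operator("TARGET", (stage["table"],)))
        else:
            raise ValueError(f"unsupported stage: {name}")
    return ops


def operators_to_mapping(ops: list[Operator]) -> dict:
    """Translate operators back into the hypothetical mapping format."""
    mapping: dict = {"where": None}
    for op in ops:
        if op.kind == "SOURCE":
            mapping["source"] = op.args[0]
        elif op.kind == "FILTER":
            mapping["where"] = op.args[0]
        elif op.kind == "PROJECT":
            mapping["columns"] = dict(op.args)
        elif op.kind == "TARGET":
            mapping["target"] = op.args[0]
    return mapping


# The same transformation expressed in both paradigms.
mapping = {
    "source": "customers",
    "target": "us_customers",
    "where": "country = 'US'",
    "columns": {"cust_name": "name", "cust_id": "id"},
}
job = [
    {"stage": "read", "table": "customers"},
    {"stage": "filter", "predicate": "country = 'US'"},
    {"stage": "transform", "derivations": {"cust_id": "id", "cust_name": "name"}},
    {"stage": "write", "table": "us_customers"},
]

# Both translate to the same operator pipeline, and the ETL job can be
# turned into a mapping by going through that shared representation.
same_pipeline = mapping_to_operators(mapping) == etl_job_to_operators(job)
recovered = operators_to_mapping(etl_job_to_operators(job))
```

In this toy setting, converting an ETL job into a mapping is just the composition of two translations through the shared representation. A real system would need a graph rather than a linear pipeline, and a richer operator set (joins, unions, aggregation), which this sketch deliberately leaves out.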
