Modeling and managing ETL processes

Extraction-Transformation-Loading (ETL) tools are pieces of software responsible for extracting data from several sources, cleansing and customizing it, and inserting it into a data warehouse. The design, development and deployment of ETL processes are currently performed in an ad hoc, in-house fashion and need modeling, design and methodological foundations. Unfortunately, the research community still has a lot of work to do to confront this shortcoming. Our research explores a coherent framework for the conceptual, logical, and physical design of ETL processes. We delve into the modeling of ETL activities and provide a conceptual and a logical abstraction for the representation of these processes. Moreover, we focus on the optimization of ETL processes in order to minimize their execution time.
