Automatic generation of ETL processes from conceptual models

Data warehouses (DWs) integrate different data sources in order to give the decision-maker a multidimensional view of them. To this end, ETL (Extraction, Transformation and Load) processes are responsible for extracting data from heterogeneous operational data sources, transforming it (conversion, cleaning, standardization, etc.), and loading it into the DW. In recent years, several conceptual modeling approaches have been proposed for designing ETL processes. Although these approaches are very useful for documenting ETL processes and supporting the designer's tasks, they provide no mechanism for an automatic code generation stage. Such a stage is needed both to avoid errors and to save development time when implementing complex ETL processes. Therefore, in this paper we define an approach for the automatic code generation of ETL processes. To this end, we align the modeling of ETL processes in DWs with MDA (Model Driven Architecture) by formally defining a set of QVT (Query/View/Transformation) transformations.
