Policy-Regulated Management of ETL Evolution

In this paper, we address the problem of predicting the impact of changes to the schema or structure of data warehouse sources. We abstract Extract-Transform-Load (ETL) activities as queries and sequences of views. ETL activities and their sources are modeled uniformly as a graph that is annotated with policies for the management of evolution events. Given a change at an element of the graph, our method detects the parts of the graph that are affected by the change and highlights how they are configured to respond to it. For many cases of ETL source evolution, we present rules that preserve both the syntactic and the semantic correctness of activities. Finally, we evaluate our approach experimentally over real-world ETL workflows used in the Greek public sector.
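The idea of a policy-annotated graph can be illustrated with a minimal sketch. It is not the paper's algorithm: the class `EvolutionGraph`, the event name `delete_attribute`, the action names `propagate` and `block`, and the node names are all hypothetical, chosen only to show how an event raised at one node might be traced through dependent nodes while each node's policy decides whether the event travels further.

```python
from collections import deque


class EvolutionGraph:
    """Toy dependency graph whose nodes carry per-event policies (illustrative only)."""

    def __init__(self):
        self.consumers = {}  # provider node -> list of nodes that read from it
        self.policies = {}   # (node, event) -> "propagate" or "block" (assumed names)

    def add_dependency(self, provider, consumer):
        self.consumers.setdefault(provider, []).append(consumer)

    def set_policy(self, node, event, action):
        self.policies[(node, event)] = action

    def impact(self, start, event, default="propagate"):
        """Return {node: action} for every node the event reaches from `start`."""
        affected = {}
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for consumer in self.consumers.get(node, []):
                if consumer in seen:
                    continue
                seen.add(consumer)
                action = self.policies.get((consumer, event), default)
                affected[consumer] = action
                # A blocking node absorbs the event; its own consumers are
                # not reached through it.
                if action == "propagate":
                    queue.append(consumer)
        return affected


# Hypothetical workflow: a source feeds a view, which feeds two activities.
g = EvolutionGraph()
g.add_dependency("SRC", "V_CLEAN")
g.add_dependency("V_CLEAN", "ACT_LOAD")
g.add_dependency("V_CLEAN", "ACT_AGG")
g.add_dependency("ACT_LOAD", "DW_FACT")
g.set_policy("ACT_LOAD", "delete_attribute", "block")

result = g.impact("SRC", "delete_attribute")
```

In this sketch `ACT_LOAD` blocks the event, so `DW_FACT`, which is reachable only through it, does not appear in `result`, while `V_CLEAN` and `ACT_AGG` are reported as propagating the change.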
