A Framework for User-Centered Declarative ETL

As business requirements evolve with increasing information density and velocity, there is a growing need for efficiency and automation of Extract-Transform-Load (ETL) processes. Current approaches to the modeling and optimization of ETL processes provide platform-independent optimization solutions for the (semi-)automated transition among different abstraction levels, focusing on cost and performance. However, the suggested representations are not abstract enough to communicate business requirements, and the role of process quality from a user-centered perspective has not yet been adequately examined. In this paper, we introduce a novel methodology for the end-to-end design of ETL processes that takes into consideration both functional and non-functional requirements. Building on existing work, we raise the level of abstraction for the conceptual representation of ETL operations, and we show how process quality characteristics can generate specific patterns in the process design.
