pygrametl: a powerful programming framework for extract-transform-load programmers

Extract-Transform-Load (ETL) processes are used to extract data, transform it, and load it into data warehouses (DWs). Many tools for creating ETL processes exist. The dominant tools all use graphical user interfaces (GUIs) in which the developer visually defines the data flow and operations. In this paper, we challenge this approach and propose to do ETL programming by writing code. To make the programming easy, we present the Python-based framework pygrametl, which offers commonly used functionality for ETL development. By using the framework, the developer can efficiently create effective ETL solutions that exploit the full power of programming. Our experiments show that when pygrametl is used, both the development time and the running time are short compared to an existing GUI-based tool.
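
To give a feel for what code-based ETL can look like, the following is a minimal sketch written with the Python standard library only. It does not use or reproduce the pygrametl API. The table names (product_dim, sales_fact), the helper ensure_product, and the sample rows are hypothetical and chosen purely for illustration.

```python
import sqlite3
from decimal import Decimal

# Hypothetical source rows; a real flow would read them from a file or database.
SOURCE_ROWS = [
    {"product": "widget", "category": "tools", "amount": "19.50"},
    {"product": "gadget", "category": "toys", "amount": "5.25"},
    {"product": "widget", "category": "tools", "amount": "3.00"},
]


def ensure_product(conn, cache, name, category):
    """Return the surrogate key of a product, inserting the row if it is new."""
    key = cache.get((name, category))
    if key is None:
        cur = conn.execute(
            "INSERT INTO product_dim (name, category) VALUES (?, ?)",
            (name, category),
        )
        key = cur.lastrowid
        cache[(name, category)] = key
    return key


def run_etl(conn, rows):
    """Extract rows, transform the amounts, and load a dimension and a fact table."""
    conn.execute(
        "CREATE TABLE product_dim ("
        "product_id INTEGER PRIMARY KEY, name TEXT, category TEXT)"
    )
    conn.execute("CREATE TABLE sales_fact (product_id INTEGER, amount_cents INTEGER)")
    cache = {}  # in-memory lookup cache: (name, category) -> product_id
    for row in rows:
        # Transform: convert the textual amount to integer cents.
        amount_cents = int(Decimal(row["amount"]) * 100)
        # Look up the dimension member, or insert it on first sight.
        product_id = ensure_product(conn, cache, row["product"], row["category"])
        # Load: the fact row references the dimension by its surrogate key.
        conn.execute(
            "INSERT INTO sales_fact (product_id, amount_cents) VALUES (?, ?)",
            (product_id, amount_cents),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
run_etl(conn, SOURCE_ROWS)
```

The whole flow is ordinary code, so loops, functions, and caching are available without any special tooling. A framework in the spirit described above would package recurring steps, such as the lookup-or-insert in ensure_product, as reusable functionality.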