Implementation of web-ETL transformation with pre-configured multi-source system connection and transformation mapping statistics report

Data warehousing provides an interesting alternative to the traditional approach of heterogeneous database integration. Rather than using a query-driven approach, data warehousing employs an update-driven approach in which information from multiple heterogeneous sources is integrated in advance and stored in a warehouse for direct querying and analysis. Building a data warehouse involves various tools: modeling tools to design the warehouse, database tools to physically build the database and load the data, and programming languages to extract the data from the sources, apply business transformations, and load it in a consistent format. The conventional process of developing custom code or scripts for these tasks is costly, error-prone, and time-consuming. In this paper we propose a web-based ETL framework whose distinguishing feature is preconfigured multi-source connections, which can be stored and reused later to perform a sequence of transformations. A viewable transformation report, giving the time taken to perform the transformations and the source-to-target metadata mapping, is made available so that the user can assess data quality and accuracy. In addition, the entire loading process, that is, the movement of data from the source system to the target system, is made visible to the user. All of the above has been modeled using UML for the web-based approach.
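As an illustration of the two features highlighted above (stored, reusable source connections and a transformation report with timing and source-to-target mapping), the following is a minimal Python sketch. It is not the framework's actual implementation: the names `ConnectionConfig`, `ConnectionRegistry`, `Mapping`, `TransformationReport`, and `run_transformations`, as well as the sample connection and rows, are hypothetical and chosen only for illustration. Rows are supplied in memory rather than read from a real source system.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ConnectionConfig:
    """Hypothetical stored settings for one source system."""
    name: str
    kind: str       # label only in this sketch, e.g. "csv" or "rdbms"
    location: str   # label only; no real connection is opened here


@dataclass
class Mapping:
    """One source-to-target column mapping with its transformation."""
    source_column: str
    target_column: str
    transform: Callable[[object], object]


@dataclass
class TransformationReport:
    """Statistics made viewable to the user after a run."""
    connection: str
    rows_processed: int
    elapsed_seconds: float
    mappings: list = field(default_factory=list)  # (source, target) pairs


class ConnectionRegistry:
    """Keeps preconfigured connections so they can be reused later."""

    def __init__(self):
        self._configs = {}

    def save(self, config):
        self._configs[config.name] = config

    def get(self, name):
        return self._configs[name]


def run_transformations(config, rows, mappings):
    """Apply every mapping to every row and time the whole pass."""
    start = time.perf_counter()
    target_rows = [
        {m.target_column: m.transform(row[m.source_column]) for m in mappings}
        for row in rows
    ]
    elapsed = time.perf_counter() - start
    report = TransformationReport(
        connection=config.name,
        rows_processed=len(target_rows),
        elapsed_seconds=elapsed,
        mappings=[(m.source_column, m.target_column) for m in mappings],
    )
    return target_rows, report


# Assumed sample data: one saved connection and two in-memory source rows.
registry = ConnectionRegistry()
registry.save(ConnectionConfig("sales_csv", "csv", "sales.csv"))

source_rows = [{"cust": " alice ", "amt": "10.5"}, {"cust": "BOB", "amt": "3"}]
mappings = [
    Mapping("cust", "customer_name", lambda v: v.strip().title()),
    Mapping("amt", "amount", float),
]

target_rows, report = run_transformations(
    registry.get("sales_csv"), source_rows, mappings
)
print(report)
```

The report object carries the elapsed time and the list of source-to-target column pairs, which is the kind of information a user could inspect to judge whether a run behaved as expected.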