Supporting the ETL-process by Web Service technologies

Extracting data from heterogeneous data sources and transferring it into the data warehouse system is one of the most cost-intensive tasks in setting up and operating a data warehouse. Special tools may be used to connect the different source and target systems. In this paper, we propose an architecture that enables the flexible integration of data sources into any target database system. The approach is based on the idea of splitting the classical wrapping module into a source-specific and a target-specific part and establishing the communication between these components by means of Web Service technology. We describe the general architecture, the use of Web Service technology to describe and dynamically integrate participating data sources, and the deployment within a specific database system.
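
The split described above can be illustrated with a minimal sketch. It is not the architecture from the paper: the abstract does not specify interfaces or message formats, so the function names `source_extract` and `target_load`, the XML layout, and the use of SQLite as the target system are all assumptions made for illustration. A plain XML string stands in for the Web Service message that would normally travel between the two parts.

```python
import sqlite3
import xml.etree.ElementTree as ET


def source_extract(rows, columns):
    """Hypothetical source-specific part: serialize rows into a neutral XML message."""
    root = ET.Element("rows")
    for row in rows:
        rec = ET.SubElement(root, "row")
        for name, value in zip(columns, row):
            ET.SubElement(rec, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


def target_load(message, conn, table, columns):
    """Hypothetical target-specific part: parse the message and load it into the target."""
    root = ET.fromstring(message)
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    # Table and column names are trusted here; a real loader would validate them.
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({col_list})")
    records = [
        tuple(rec.findtext(col) for col in columns)
        for rec in root.findall("row")
    ]
    conn.executemany(
        f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})", records
    )
    conn.commit()
    return len(records)


# The message is the only thing the two parts share (assumed format).
columns = ["id", "name"]
message = source_extract([(1, "alice"), (2, "bob")], columns)

conn = sqlite3.connect(":memory:")  # stand-in for the target database system
loaded = target_load(message, conn, "customers", columns)
result = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
```

Because the two functions share only the message, either side could be replaced (a different source reader, a different target database) without touching the other, which is the kind of flexibility the split is meant to provide.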
