UCLEAN: A REQUIREMENT-BASED OBJECT-ORIENTED ETL FRAMEWORK

A data warehouse is used to provide effective results from multidimensional data analysis. The accuracy and correctness of these results depend on the quality of the data. To improve data quality, data must be properly extracted, transformed, and loaded into the data warehouse. This ETL process is the key to the success of a data warehouse. In this paper we propose a conceptual ETL framework, called UCLEAN, for object-oriented data warehouse design. The framework takes the users' requirements into account. Data is extracted from different UML sources and converted into a multidimensional model. It is then cleaned and loaded into the data warehouse. We validate the effectiveness of the framework through a case study.
