A UML-based data warehouse design method

Data warehouses are a major component of data-driven decision support systems (DSS). They rely on multidimensional models, which give decision makers a business-oriented view of the data and thereby ease data navigation and analysis with On-Line Analytical Processing (OLAP) tools. These models also determine how the data are stored in the data warehouse for subsequent use, not only by OLAP tools but also by other decision support tools. Data warehouse design is a complex task that requires a systematic method, yet few such methods have been proposed to date. This paper presents a UML-based data warehouse design method that spans the three design phases (conceptual, logical and physical). Our method comprises a set of metamodels, one used at each phase, and a set of transformations between them that can be semi-automated. In keeping with our object-oriented approach, we represent all the metamodels in UML and illustrate how the transformations can be formally specified in the OMG's Object Constraint Language (OCL). We apply the method to a case study throughout the paper.
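
To make the idea of a semi-automatable transformation between phases more concrete, the following is a minimal sketch of a conceptual-to-logical mapping. It is not the paper's metamodel or its OCL specification: the class names (`Dimension`, `Fact`, `Table`), the function `to_star_schema`, the `_id` surrogate-key convention, and the choice of a relational star schema as the logical target are all assumptions made for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Dimension:
    """Hypothetical conceptual-level dimension: a name plus descriptive attributes."""
    name: str
    attributes: list[str]


@dataclass
class Fact:
    """Hypothetical conceptual-level fact: measures analysed along dimensions."""
    name: str
    measures: list[str]
    dimensions: list[Dimension]


@dataclass
class Table:
    """Hypothetical logical-level relational table."""
    name: str
    columns: list[str]
    primary_key: list[str]
    # Maps a foreign-key column to the name of the table it references.
    foreign_keys: dict[str, str] = field(default_factory=dict)


def to_star_schema(fact: Fact) -> list[Table]:
    """Map one conceptual fact to dimension tables plus a fact table.

    Assumed rule: each dimension becomes a table with a surrogate key
    named '<dimension>_id'; the fact table is keyed by all of those
    surrogate keys and carries the measures as ordinary columns.
    """
    dimension_tables: list[Table] = []
    foreign_keys: dict[str, str] = {}
    for dim in fact.dimensions:
        key = f"{dim.name.lower()}_id"
        dimension_tables.append(Table(dim.name, [key, *dim.attributes], [key]))
        foreign_keys[key] = dim.name
    fact_table = Table(
        name=fact.name,
        columns=[*foreign_keys, *fact.measures],
        primary_key=list(foreign_keys),
        foreign_keys=foreign_keys,
    )
    return [*dimension_tables, fact_table]


# Illustrative input; the fact and dimension names are invented.
sales = Fact(
    name="Sales",
    measures=["quantity", "amount"],
    dimensions=[
        Dimension("Product", ["name", "category"]),
        Dimension("Store", ["city", "region"]),
    ],
)
tables = to_star_schema(sales)
for table in tables:
    print(table.name, table.columns)
```

A rule written this way is deterministic, which is what makes automation possible. The "semi" in semi-automated would correspond to choices a designer still has to make, such as which attributes to keep or how to name keys, and those choices are hard-coded in this sketch.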
