Designing Data Warehouses with OO Conceptual Models

Most developers agree that data warehouse, multidimensional database (MDB), and online analytical processing (OLAP) applications emphasize multidimensional modeling, which offers two benefits. First, the multidimensional model closely parallels how data analyzers think and, therefore, helps users understand data. Second, this approach helps predict what end users want to do, thereby facilitating performance improvements.
Developers have proposed various approaches for the conceptual design of multidimensional systems. These proposals try to represent the main multidimensional properties at the conceptual level, with special emphasis on data structures. A conceptual modeling approach for data warehouses, however, should also address other relevant aspects such as initial user requirements, system behavior, available data sources, and specific issues related to automatic generation of the database schemas. We believe that object orientation with the Unified Modeling Language can provide an adequate notation for modeling every aspect of a data warehouse system, from user requirements to implementation.
We propose an OO approach to accomplish the conceptual modeling of data warehouse, MDB, and OLAP applications. This approach introduces a set of minimal constraints and extensions to UML [1] for representing multidimensional modeling properties for these applications. We base these extensions on the standard mechanisms that UML provides for adapting itself to a specific method or model, such as constraints and tagged values. Our work builds on previous research [2-4], which provided a foundation for the results we report here and for earlier versions of our work. We believe that our innovative approach provides a theoretical foundation for the use of OO databases and object-relational databases in data warehouse, MDB, and OLAP applications.
We use UML to design data warehouses because it considers an information system's structural and dynamic properties at the conceptual level more naturally than do classic approaches such as the Entity-Relationship model. Further, UML provides powerful mechanisms, such as the Object Constraint Language [1] and the Object Query Language [1], for embedding data warehouse constraints and initial user requirements in the conceptual model. This approach to modeling a data warehouse system yields simple yet powerful extended UML class diagrams that represent main data warehouse properties at the conceptual level.
Multidimensional modeling structures information into facts and dimensions. We define a fact as an item of interest for an enterprise, and describe it through a set of attributes called measures or fact attributes (atomic or derived), which are contained in cells or points within the data cube. We base …