Multidimensional data modeling for complex data

Online Analytical Processing (OLAP) systems considerably ease the process of analyzing business data and have become widely used in industry. Such systems primarily employ multidimensional data models to structure their data. However, current multidimensional data models fall short in their ability to model the complex data found in some real-world application domains. The paper presents nine requirements for multidimensional data models, each of which is exemplified by a real-world clinical case study. A survey of the existing models reveals that the requirements not currently met include support for many-to-many relationships between facts and dimensions, built-in support for handling change and time, and support for uncertainty as well as different levels of granularity in the data. The paper defines an extended multidimensional data model, and an associated algebra, which address all nine requirements.
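To make the many-to-many requirement concrete, the following minimal sketch shows why such relationships between facts and dimensions complicate aggregation. It is not the paper's model or algebra; the patient and diagnosis data, the hierarchy, and all function names are hypothetical and chosen only to echo the clinical setting mentioned above.

```python
from collections import defaultdict

# Hypothetical data: each patient (fact) may have several diagnoses
# (dimension values), i.e. a many-to-many fact-dimension relationship.
patient_diagnoses = {
    "p1": ["insulin-dependent diabetes", "non-insulin-dependent diabetes"],
    "p2": ["insulin-dependent diabetes"],
    "p3": ["hypertension"],
}

# Hypothetical hierarchy: low-level diagnosis -> diagnosis group.
diagnosis_group = {
    "insulin-dependent diabetes": "diabetes",
    "non-insulin-dependent diabetes": "diabetes",
    "hypertension": "cardiovascular",
}


def patients_per_diagnosis(facts):
    """Map each low-level diagnosis to the set of patients having it."""
    result = defaultdict(set)
    for patient, diagnoses in facts.items():
        for diagnosis in diagnoses:
            result[diagnosis].add(patient)
    return result


def count_by_group_naive(facts, hierarchy):
    """Roll up by summing per-diagnosis counts; may count a patient twice."""
    counts = defaultdict(int)
    for diagnosis, patients in patients_per_diagnosis(facts).items():
        counts[hierarchy[diagnosis]] += len(patients)
    return dict(counts)


def count_by_group_distinct(facts, hierarchy):
    """Roll up by taking the union of patient sets; each patient counts once."""
    groups = defaultdict(set)
    for diagnosis, patients in patients_per_diagnosis(facts).items():
        groups[hierarchy[diagnosis]] |= patients
    return {group: len(patients) for group, patients in groups.items()}


naive = count_by_group_naive(patient_diagnoses, diagnosis_group)
distinct = count_by_group_distinct(patient_diagnoses, diagnosis_group)
print(naive)
print(distinct)
```

In this toy data, patient `p1` has two diagnoses in the same group, so summing the lower-level counts overstates the number of diabetes patients, whereas the set-based roll-up counts each patient once. A model without built-in support for many-to-many relationships leaves this bookkeeping to the user.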
