Dimension Compatibility for Data Mart Integration

The problem of integrating autonomous data marts arises when, for example, a large organization (or a federation of organizations) needs to combine independently developed data warehouses. This problem can be tackled systematically for two main reasons. First, data marts are usually structured in a rather uniform way, along dimensions and facts. Second, data quality in data marts is usually higher than in generic databases, since data marts are obtained by reconciling several data sources. Our reference scenario is a federation of data marts that must be queried in a unified way by means of drill-across operations. We propose a novel notion of dimension compatibility and characterize its general properties. We then show the significance of dimension compatibility in performing drill-across queries over autonomous data marts.
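To make the drill-across operation concrete, the following is a minimal sketch, not the formal notion of dimension compatibility proposed in the paper. It assumes two hypothetical data marts (sales and shipments) whose time dimensions can both be rolled up to a common month level. All table contents and function names are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical fact rows from two independently built data marts.
# Each row is (day, store, measure); names and values are illustrative.
sales_facts = [
    ("2024-01-03", "S1", 100.0),
    ("2024-01-17", "S1", 50.0),
    ("2024-02-02", "S2", 70.0),
]
shipment_facts = [
    ("2024-01-05", "S1", 8),
    ("2024-02-09", "S2", 3),
    ("2024-02-20", "S2", 4),
]


def to_month(day):
    """Roll a day member (YYYY-MM-DD) up to its month (YYYY-MM)."""
    return day[:7]


def aggregate_by_month(facts):
    """Sum the measure of each fact at the month level."""
    totals = defaultdict(float)
    for day, _store, measure in facts:
        totals[to_month(day)] += measure
    return dict(totals)


def drill_across(left, right):
    """Join two aggregates on the members they share at the common level."""
    common = sorted(set(left) & set(right))
    return [(member, left[member], right[member]) for member in common]


result = drill_across(
    aggregate_by_month(sales_facts),
    aggregate_by_month(shipment_facts),
)
for month, sales, shipped in result:
    print(month, sales, shipped)
```

The join is only meaningful because both marts agree on what a month is and on which days belong to it. In this sketch that agreement is assumed; establishing when it holds between autonomous dimensions is the question a compatibility notion has to answer.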
