Heuristic optimization of OLAP queries in multidimensionally hierarchically clustered databases

On-line analytical processing (OLAP) is a technology that encompasses applications requiring a multidimensional and hierarchical view of data. OLAP applications often require fast response times for complex grouping/aggregation queries over enormous quantities of data. Commercial relational database management systems mainly use multiple one-dimensional indexes to process OLAP queries that restrict multiple dimensions. However, in many cases, multidimensional access methods outperform one-dimensional indexing methods. We present an architecture for multidimensional databases that are clustered with respect to multiple hierarchical dimensions. It is based on the star schema and is called CSB star. We then focus on heuristically optimizing OLAP queries over this schema using multidimensional access methods. Users can still formulate their queries over a traditional star schema; the query processor then rewrites them over the CSB star. We exploit the different clustering features of the CSB star to efficiently process a class of typical OLAP queries. We detect special cases where the construction of an evaluation plan can be simplified, and we discuss improvements to our technique.
