Aggregate-based query processing in a parallel data warehouse server

In recent years, data warehousing has emerged as a fundamental database technology that provides the basis for online analytical processing (OLAP). Analytical queries generally involve aggregations over large data sets, which causes serious performance problems when ad-hoc queries must be answered online. One way to avoid performance bottlenecks is to use parallel hardware, i.e. SMP or MPP machines, that can cope with the data volume. Another optimization approach, specific to data warehousing, is to preaggregate some of the results so that the base relations need not be scanned. The prototype OLAP system CUBESTAR PARALLEL SERVER combines both approaches. To achieve high query performance at low hardware cost, we present a technique for the dynamic, i.e. query-behavior- and load-dependent, use and management of multidimensional aggregates in a shared-nothing workstation cluster.
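
The idea of answering a query from a preaggregate instead of the base relation can be illustrated with a minimal Python sketch. It is not the CUBESTAR implementation: the function names `aggregate` and `answer`, the sample rows, and the rule of picking the smallest covering aggregate are all assumptions made for illustration, and only the distributive SUM function is handled.

```python
from collections import defaultdict

def aggregate(rows, group_by, measure="sales"):
    """Sum `measure` over `rows`, grouped by the dimension names in `group_by`."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in group_by)
        totals[key] += row[measure]
    return [dict(zip(group_by, key), **{measure: total})
            for key, total in totals.items()]

def answer(group_by, base, materialized):
    """Answer a SUM query from a stored aggregate if one covers it.

    `materialized` maps a tuple of dimension names to preaggregated rows.
    Hypothetical selection rule: use the covering aggregate with the
    fewest rows; fall back to scanning the base relation otherwise.
    """
    candidates = [(dims, rows) for dims, rows in materialized.items()
                  if set(group_by) <= set(dims)]
    if candidates:
        dims, rows = min(candidates, key=lambda c: len(c[1]))
        return aggregate(rows, group_by), dims
    return aggregate(base, group_by), None

# Hypothetical base relation with three dimensions and one measure.
base = [
    {"region": "north", "product": "a", "month": 1, "sales": 10},
    {"region": "north", "product": "a", "month": 2, "sales": 5},
    {"region": "north", "product": "b", "month": 1, "sales": 7},
    {"region": "south", "product": "a", "month": 1, "sales": 3},
]

# One preaggregate, stored ahead of time.
materialized = {("region", "product"): aggregate(base, ("region", "product"))}

# Sales per region can be derived from the (region, product) aggregate.
result, source = answer(("region",), base, materialized)
```

Here a query grouped by `region` is served from the stored `(region, product)` aggregate, while a query grouped by `month` is not covered by it and falls back to the base relation. In a shared-nothing cluster, a further decision is which node holds which aggregate; that is outside the scope of this sketch.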
