Adding a Performance-Oriented Perspective to Data Warehouse Design

Data warehouse design is clearly dominated by the business perspective. Quite often, data warehouse administrators are led to data models that leave little room for performance improvement. However, users' increasing demands for interactive response times make query performance one of the central problems of data warehousing today. In this paper we argue that data warehouse design must take both the business and the performance perspectives into account from the beginning, and we propose extending typical design methodologies to include performance concerns in the early design steps. Specific analyses of the predicted data warehouse usage profile and of the meta-data are proposed as new inputs for improving the transition from the logical to the physical schema. The proposed approach is illustrated and discussed using the TPC-H performance benchmark, and it is shown that significant performance improvement can be achieved without jeopardizing the business view required for data warehouse models.

[18]  Hans-Joachim Lenz,et al.  Tree Based Indexes Versus Bitmap Indexes: A Performance Study , 2001, Int. J. Cooperative Inf. Syst..