Integrating vertical and horizontal partitioning into automated physical database design

In addition to indexes and materialized views, horizontal and vertical partitioning are important aspects of physical design in a relational database system that significantly impact performance. Horizontal partitioning also provides manageability; database administrators often require indexes and their underlying tables partitioned identically so as to make common operations such as backup/restore easier. While partitioning is important, incorporating partitioning makes the problem of automating physical design much harder since: (a) The choices of partitioning can strongly interact with choices of indexes and materialized views. (b) A large new space of physical design alternatives must be considered. (c) Manageability requirements impose a new constraint on the problem. In this paper, we present novel techniques for designing a scalable solution to this integrated physical design problem that takes both performance and manageability into account. We have implemented our techniques and evaluated it on Microsoft SQL Server. Our experiments highlight: (a) the importance of taking an integrated approach to automated physical design and (b) the scalability of our techniques.

[1]  Erhard Rahm,et al.  Multi-Dimensional Database Allocation for Parallel Data Warehouses , 2000, VLDB.

[2]  Hasan Pirkul,et al.  An integrated model of record segmentation and access path selection for databases , 1988, Inf. Syst..

[3]  David J. DeWitt,et al.  Weaving Relations for Cache Performance , 2001, VLDB.

[4]  Domenico Saccà,et al.  Database partitioning in a cluster of processors , 1983, TODS.

[5]  Daniel C. Zilio,et al.  Partitioning Key Selection for a Shared-nothing Parallel Database System , 1994 .

[6]  Surajit Chaudhuri,et al.  AutoAdmin “what-if” index analysis utility , 1998, SIGMOD '98.

[7]  Daniel C. Zilio,et al.  Physical database design decision algorithms and concurrent reorganization for parallel database systems , 1998 .

[8]  Alfons Kemper,et al.  Experience Report: Exploiting Advanced Database Optimization Features for Large-Scale SAP R/3 Installations , 2002, VLDB.

[9]  Surajit Chaudhuri,et al.  Automated Selection of Materialized Views and Indexes in SQL Databases , 2000, VLDB.

[10]  Surajit Chaudhuri,et al.  Index merging , 1999, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).

[11]  Ramakrishnan Srikant,et al.  Fast Algorithms for Mining Association Rules in Large Databases , 1994, VLDB.

[12]  Philip S. Yu,et al.  An Effective Approach to Vertical Partitioning for Physical Design of Relational Databases , 1990, IEEE Trans. Software Eng..

[13]  Surajit Chaudhuri,et al.  An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server , 1997, VLDB.

[14]  Daniel C. Zilio,et al.  DB2 advisor: an optimizer smart enough to recommend its own indexes , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[15]  Chun Zhang,et al.  Automating physical database design in a parallel database , 2002, SIGMOD '02.

[16]  Jeffrey D. Ullman,et al.  Index selection for OLAP , 1997, Proceedings 13th International Conference on Data Engineering.

[17]  Anastasia Ailamaki,et al.  AutoPart: automating schema design for large scientific databases using data partitioning , 2004, Proceedings. 16th International Conference on Scientific and Statistical Database Management, 2004..

[18]  Shamkant B. Navathe,et al.  Vertical partitioning for database design: a graphical algorithm , 1989, SIGMOD '89.