Exploiting the Impact of Database System Configuration Parameters: A Design of Experiments Approach

Tuning database system configuration parameters to values appropriate for the expected query workload plays a critical role in determining DBMS performance. However, the number of configuration parameters in a DBMS is very large, and typical query workloads contain many constituent queries, which makes tuning time- and effort-intensive. To reduce tuning time and effort, database administrators rely on experience and rules of thumb to select a set of important configuration parameters for tuning. Without a statistically rigorous methodology, however, time and effort may be wasted tuning parameters that have little or no effect on DBMS performance for the given query workload. Database administrators also use compressed query workloads to reduce tuning time; if not carefully selected, a compressed workload may omit a query that would reveal important performance-bottleneck parameters. In this article, we provide a systematic approach to help database administrators in tuning activities. We achieve our goals in two phases. First, we estimate the effect of each configuration parameter on each workload query. The effects are estimated through a design of experiments-based Plackett and Burman design methodology, in which the number of experiments required is linearly proportional to the number of input parameters. Second, we exploit the estimated effects to: 1) rank DBMS configuration parameters for a given query workload based on their impact on DBMS performance, and 2) select a compressed query workload that preserves the fidelity of the original workload. Experimental results using PostgreSQL and the TPC-H query workload show that our methodologies are effective.
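The screening step described above can be sketched as follows. This is a minimal illustration of a standard 8-run Plackett and Burman design for up to seven two-level factors, with parameter effects computed as the signed sum of responses and parameters ranked by absolute effect. The PostgreSQL parameter names and the response times are hypothetical placeholders, not measurements from the article, and the exact design and effect formulas used by the authors may differ in detail.

```python
# Sketch of a Plackett-Burman (P&B) screening design for ranking
# configuration parameters by their effect on query response time.
# Parameter names and response values are hypothetical illustrations.

# 8-run P&B design for up to 7 two-level factors: the first row is the
# standard generating vector; the next six rows are cyclic shifts of it,
# and the final row sets every factor to its "low" level.
GENERATOR = [+1, +1, +1, -1, +1, -1, -1]

def plackett_burman_8():
    rows = []
    g = GENERATOR[:]
    for _ in range(7):
        rows.append(g[:])
        g = [g[-1]] + g[:-1]   # cyclic right shift
    rows.append([-1] * 7)      # base run: every factor at its low level
    return rows

def parameter_effects(design, responses):
    """Effect of factor j = sum over runs i of design[i][j] * responses[i]."""
    n_factors = len(design[0])
    return [sum(design[i][j] * responses[i] for i in range(len(design)))
            for j in range(n_factors)]

# Hypothetical PostgreSQL knobs chosen for illustration only.
params = ["shared_buffers", "work_mem", "effective_cache_size",
          "random_page_cost", "wal_buffers", "checkpoint_segments",
          "max_connections"]

design = plackett_burman_8()
# Hypothetical query response times (seconds), one per experiment run.
responses = [42.0, 38.5, 55.1, 47.3, 40.2, 58.9, 44.7, 61.0]

effects = parameter_effects(design, responses)
# Rank parameters by the magnitude of their estimated effect.
ranking = sorted(zip(params, effects), key=lambda pe: -abs(pe[1]))
for name, eff in ranking:
    print(f"{name:22s} effect = {eff:+.1f}")
```

Only 8 experiment runs are needed to screen 7 parameters, which is the linear scaling the abstract refers to: an N-run P&B design screens up to N-1 factors, versus 2^7 = 128 runs for a full factorial design.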
