Automating Statistics Management for Query Optimizers

Statistics play a key role in the quality of the plans chosen by a database query optimizer. In this paper, we identify the statistics that are essential for an optimizer. We introduce novel techniques that significantly reduce the set of statistics that must be created, without sacrificing the quality of the query plans generated. We discuss how these techniques can be leveraged to automate statistics management in databases. We have implemented and experimentally evaluated our approach on Microsoft SQL Server 7.0.
