Efficient mid-query re-optimization of sub-optimal query execution plans

For a number of reasons, even the best query optimizers often produce sub-optimal query execution plans, leading to significant performance degradation. This is especially true in databases used for complex decision support queries and in object-relational databases. In this paper, we describe an algorithm that detects sub-optimality of a query execution plan during query execution and attempts to correct the problem. The basic idea is to collect statistics at key points during the execution of a complex query. These statistics are then used to optimize the execution of the query, either by improving the resource allocation for that query or by changing the execution plan for the remainder of the query. To ensure that this does not significantly slow down the normal execution of a query, the query optimizer carefully chooses what statistics to collect, when to collect them, and the circumstances under which to re-optimize the query. We describe an implementation of this algorithm in the Paradise Database System, and we report on performance studies indicating that the approach can significantly improve the performance of complex queries.
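The decision made at each statistics collection point can be illustrated with a minimal sketch. It is not the algorithm implemented in Paradise: the names `Checkpoint`, `estimation_error`, and `decide`, the ratio-error measure, and all threshold values are hypothetical and chosen only to show the shape of the trade-off between continuing, adjusting resources, and re-planning the remainder of the query.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"      # keep the current plan and resource allocation
    REALLOCATE = "reallocate"  # keep the plan, adjust resource allocation
    REOPTIMIZE = "reoptimize"  # re-plan the remainder of the query


@dataclass
class Checkpoint:
    """Statistics available at one collection point (hypothetical fields)."""
    estimated_rows: float  # optimizer's compile-time cardinality estimate
    observed_rows: float   # cardinality measured during execution
    remaining_cost: float  # estimated cost of the not-yet-executed plan part
    reopt_cost: float      # estimated cost of invoking the optimizer again


def estimation_error(cp: Checkpoint) -> float:
    """Symmetric ratio error, always >= 1; 1 means a perfect estimate."""
    est = max(cp.estimated_rows, 1.0)  # clamp to avoid division by zero
    obs = max(cp.observed_rows, 1.0)
    return max(est / obs, obs / est)


def decide(cp: Checkpoint,
           realloc_threshold: float = 2.0,   # assumed value
           reopt_threshold: float = 10.0,    # assumed value
           max_overhead: float = 0.05) -> Action:  # assumed value
    """Pick an action from the observed error and the re-planning overhead."""
    err = estimation_error(cp)
    if err < realloc_threshold:
        return Action.CONTINUE
    # Re-plan only if the optimizer call is small relative to remaining work.
    cheap_enough = cp.reopt_cost <= max_overhead * cp.remaining_cost
    if err >= reopt_threshold and cheap_enough:
        return Action.REOPTIMIZE
    return Action.REALLOCATE
```

For example, `decide(Checkpoint(1000, 50000, 500, 5))` selects re-optimization because the estimate is off by a factor of 50 and re-planning is cheap relative to the remaining work, whereas the same error with only 20 cost units of work left falls back to reallocation.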