Identifying robust plans through plan diagram reduction

Estimates of predicate selectivities by database query optimizers often differ significantly from those actually encountered during query execution, leading to poor plan choices and inflated response times. In this paper, we investigate mitigating this problem by replacing selectivity error-sensitive plan choices with alternative plans that provide robust performance. Our approach is based on the recent observation that even the complex and dense "plan diagrams" associated with industrial-strength optimizers can be efficiently reduced to "anorexic" equivalents featuring only a few plans, without materially impacting query processing quality. Extensive experimentation with a rich set of TPC-H and TPC-DS-based query templates in a variety of database environments indicate that plan diagram reduction typically retains plans that are substantially resistant to selectivity errors on the base relations. However, it can sometimes also be severely counter-productive, with the replacements performing much worse. We address this problem through a generalized mathematical characterization of plan cost behavior over the parameter space, which lends itself to efficient criteria of when it is safe to reduce. Our strategies are fully non-invasive and have been implemented in the Picasso optimizer visualization tool.

[1]  Jayant R. Haritsa,et al.  Analyzing Plan Diagrams of Database Query Optimizers , 2005, VLDB.

[2]  Joseph Y. Halpern,et al.  Least expected cost query optimization: what can we expect? , 2002, PODS.

[3]  KabraNavin,et al.  Efficient mid-query re-optimization of sub-optimal query execution plans , 1998 .

[4]  J. S. Saini,et al.  Adaptive Query Processing , 2006 .

[5]  Jayant R. Haritsa,et al.  On the Production of Anorexic Plan Diagrams , 2007, VLDB.

[6]  Stavros Christodoulakis,et al.  On the propagation of errors in the size of join results , 1991, SIGMOD '91.

[7]  Guy M. Lohman,et al.  R* optimizer validation and performance evaluation for local queries , 1986, SIGMOD '86.

[8]  Surajit Chaudhuri,et al.  Self-tuning histograms: building histograms without looking at data , 1999, SIGMOD '99.

[9]  K. A. Stroud,et al.  Engineering Mathematics , 2020, Nature.

[10]  David J. DeWitt,et al.  Efficient mid-query re-optimization of sub-optimal query execution plans , 1998, SIGMOD '98.

[11]  Nicolas Bruno A critical look at the TAB benchmark for physical design tools , 2007, SGMD.

[12]  Jignesh M. Patel,et al.  Accurate Modeling of the Hybrid Hash Join Algorithm , 1994, SIGMETRICS.

[13]  Joseph Y. Halpern,et al.  Least expected cost query optimization: an exercise in utility , 1999, PODS.

[14]  David J. DeWitt,et al.  Proactive re-optimization with Rio , 2005, SIGMOD '05.

[15]  S. Sudarshan,et al.  AniPQO: Almost Non-intrusive Parametric Query Optimization for Nonlinear Cost Functions , 2003, VLDB.

[16]  F. W. Kellaway,et al.  Advanced Engineering Mathematics , 1969, The Mathematical Gazette.

[17]  Peter Slavík A Tight Analysis of the Greedy Algorithm for Set Cover , 1997, J. Algorithms.

[18]  Jayant R. Haritsa,et al.  Efficiently approximating query optimizer plan diagrams , 2008, Proc. VLDB Endow..

[19]  Hamid Pirahesh,et al.  Robust query processing through progressive optimization , 2004, SIGMOD '04.

[20]  S. Sudarshan,et al.  Parametric Query Optimization for Linear and Piecewise Linear Cost Functions , 2002, VLDB.

[21]  Volker Markl,et al.  LEO - DB2's LEarning Optimizer , 2001, VLDB.

[22]  Surajit Chaudhuri,et al.  Towards a robust query optimizer: a principled and practical approach , 2005, SIGMOD '05.

[23]  Pooja N. Darera,et al.  Robust Plans through Plan Diagram Reduction , 2008 .

[24]  David J. DeWitt,et al.  Proactive re-optimization , 2005, SIGMOD '05.