Paper Information - Operator-Based Query Progress Estimation

Operator-Based Query Progress Estimation

Recently, research has addressed the problem of estimating progress for long-running database queries. The basic idea is to “continuously” monitor execution to keep track of how much work has been done, and at the same time to collect statistics to arrive at a more and more refined estimate of the total amount of work that is needed. Previous research has generally decomposed the operator tree for the query into pipelines (or “segments”) of non-blocking operators, tried to observe progress per pipeline and then to combine progress measures of the different pipelines into an overall progress measure. It has soon become apparent that pipelines of non-blocking operators are too large units and that it is necessary to define smaller segments (e.g. containing only one join operator). In this paper we take a more radical approach where each operator in a query tree is able to estimate the progress achieved for its subtree based on the progress reported by its children. No global analysis of the query tree is needed, nor is it necessary to determine driver nodes or dominant inputs. Each operator is strictly independent in its progress estimation. Nevertheless progress estimation works fine across blocking operators and for the whole query tree. The technique lends itself to a simple and clean implementation. It is suitable for extensible database architectures where the set of query processing operators is large and possibly extended at any time. Our implementation allows one to add progress support for operators gradually such that the system runs at any time and reports progress whenever all operators in the query tree support progress. We report a prototypical implementation in the SECONDO extensible database system. Progress estimation now is a standard feature of SECONDO. To our knowledge it is the first freely available DBMS prototype that includes query progress estimation.
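The per-operator idea described in the abstract can be illustrated with a minimal sketch. This is not SECONDO's implementation or API: the class names (`ProgressInfo`, `Operator`, `Scan`, `Filter`), the cost constant, and the work units are all hypothetical and chosen only for illustration. Each operator derives an estimate for its own subtree from what its children report, and an operator without progress support returns `None`, so a tree reports progress only when every operator in it supports progress.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProgressInfo:
    """Hypothetical per-subtree estimate; all fields are illustrative."""
    card: float   # estimated number of output tuples of the subtree
    total: float  # estimated total work for the subtree (arbitrary units)
    done: float   # work completed so far in the subtree

    @property
    def fraction(self) -> float:
        return self.done / self.total if self.total > 0 else 0.0


class Operator:
    """Base class: an operator without progress support reports None."""

    def __init__(self, *children: "Operator") -> None:
        self.children: List[Operator] = list(children)

    def progress(self) -> Optional[ProgressInfo]:
        return None


class Scan(Operator):
    """Leaf operator: knows its input size and counts tuples read."""

    def __init__(self, total_tuples: int) -> None:
        super().__init__()
        self.total_tuples = total_tuples
        self.read = 0

    def progress(self) -> Optional[ProgressInfo]:
        return ProgressInfo(self.total_tuples, float(self.total_tuples), float(self.read))


class Filter(Operator):
    """Unary operator: uses only the estimate reported by its child."""

    COST_PER_TUPLE = 0.5  # assumed cost constant, not a measured value

    def __init__(self, child: Operator, selectivity: float) -> None:
        super().__init__(child)
        self.selectivity = selectivity
        self.processed = 0

    def progress(self) -> Optional[ProgressInfo]:
        child = self.children[0].progress()
        if child is None:  # a child lacks progress support: report nothing
            return None
        return ProgressInfo(
            card=child.card * self.selectivity,
            total=child.total + child.card * self.COST_PER_TUPLE,
            done=child.done + self.processed * self.COST_PER_TUPLE,
        )


# Usage: halfway through a scan of 1000 tuples feeding a filter.
scan = Scan(1000)
filt = Filter(scan, selectivity=0.1)
scan.read = 500
filt.processed = 500
estimate = filt.progress()          # built purely from the child's report
unsupported = Operator(filt).progress()  # base operator has no progress support
```

In this sketch no operator inspects the tree as a whole; `Filter` only calls `progress()` on its direct child, which mirrors the independence property the abstract emphasizes. How the actual system models costs, cardinalities, and blocking operators is not specified here.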

Ralf Hartmut Güting | R. H. Güting
