Recently, research has addressed the problem of estimating progress for long-running database queries. The basic idea is to “continuously” monitor execution to keep track of how much work has been done, and at the same time to collect statistics to arrive at an increasingly refined estimate of the total amount of work that is needed. Previous research has generally decomposed the operator tree for the query into pipelines (or “segments”) of non-blocking operators, tried to observe progress per pipeline, and then combined the progress measures of the different pipelines into an overall progress measure. It soon became apparent that pipelines of non-blocking operators are too large as units and that it is necessary to define smaller segments (e.g. containing only one join operator).
In this paper we take a more radical approach where each operator in a query tree is able to estimate the progress achieved for its subtree based on the progress reported by its children. No global analysis of the query tree is needed, nor is it necessary to determine driver nodes or dominant inputs. Each operator is strictly independent in its progress estimation. Nevertheless, progress estimation works across blocking operators and for the whole query tree. The technique lends itself to a simple and clean implementation. It is suitable for extensible database architectures where the set of query processing operators is large and may be extended at any time. Our implementation allows one to add progress support for operators gradually, such that the system runs at any time and reports progress whenever all operators in the query tree support progress. We report on a prototypical implementation in the SECONDO extensible database system. Progress estimation is now a standard feature of SECONDO. To our knowledge it is the first freely available DBMS prototype that includes query progress estimation.
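
The per-operator idea can be illustrated with a minimal sketch. This is not the SECONDO implementation: the class names, the `ProgressInfo` fields, the per-row cost constants, and the assumption that a non-blocking operator advances in lockstep with its input are all hypothetical choices made for illustration. Each operator answers a progress request using only what its children report, and an operator without progress support returns `None`, in which case no progress is reported for the tree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProgressInfo:
    card: float        # estimated number of rows the subtree will produce
    total_time: float  # estimated total work for the subtree (arbitrary units)
    progress: float    # fraction of that work already done, in [0, 1]


class Operator:
    """Hypothetical base class: by default an operator has no progress support."""
    def __init__(self, *children):
        self.children = list(children)

    def request_progress(self) -> Optional[ProgressInfo]:
        return None


class Scan(Operator):
    def __init__(self, total_rows, cost_per_row=1.0):
        super().__init__()
        self.total_rows, self.cost_per_row, self.read = total_rows, cost_per_row, 0

    def advance(self, n):
        self.read = min(self.total_rows, self.read + n)

    def request_progress(self):
        frac = self.read / self.total_rows if self.total_rows else 1.0
        return ProgressInfo(self.total_rows, self.total_rows * self.cost_per_row, frac)


class Filter(Operator):
    """Non-blocking: assumed to do its own work in lockstep with its child."""
    def __init__(self, child, selectivity=0.5, cost_per_row=0.5):
        super().__init__(child)
        self.selectivity, self.cost_per_row = selectivity, cost_per_row

    def request_progress(self):
        c = self.children[0].request_progress()
        if c is None:
            return None  # a child without support disables progress for this subtree
        total = c.total_time + c.card * self.cost_per_row
        return ProgressInfo(c.card * self.selectivity, total, c.progress)


class Sort(Operator):
    """Blocking: consumes its whole input (build phase) before emitting rows."""
    def __init__(self, child, build_cost=1.0, emit_cost=1.0):
        super().__init__(child)
        self.build_cost, self.emit_cost, self.emitted = build_cost, emit_cost, 0

    def request_progress(self):
        c = self.children[0].request_progress()
        if c is None:
            return None
        build = c.total_time + c.card * self.build_cost  # child work plus own build work
        emit = c.card * self.emit_cost
        emitted_frac = self.emitted / c.card if c.card else 1.0
        total = build + emit
        done = c.progress * build + emitted_frac * emit
        return ProgressInfo(c.card, total, done / total if total else 1.0)


# Usage: only the root is asked; every node combines its children's reports locally.
scan = Scan(total_rows=100)
root = Sort(Filter(scan))
scan.advance(50)
info = root.request_progress()
print(None if info is None else round(info.progress, 3))
```

Note that the blocking `Sort` needs no special treatment from the outside: it folds the child's reported work into its own build phase and adds its emit phase on top, so the root's estimate covers the whole tree without any global analysis.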