A heuristic query optimizer must choose the best way to process an incoming query. It makes this choice by comparing the expected cost of many (or all) of the ways in which the query might be processed. The expected cost is computed from statistics on the sizes of the relations involved and the selectivities of the operations being performed. Such estimates are subject to error, and in this paper we investigate how sensitive the best query plan is to errors in the selectivity estimates. We treat the common case of join queries and show that the optimal plan for most queries is very insensitive to selectivity inaccuracies. Hence, there is little reason for a data manager to spend much effort on making accurate estimates of join selectivities.
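The kind of sensitivity experiment described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the relation names and sizes, the selectivity values, the cost model (the sum of intermediate result sizes over left-deep join orders), and the helper names `plan_cost`, `best_plan`, and `sensitivity`. The sketch picks a plan using a misestimated join selectivity, then measures how much worse that plan is than the true optimum when costed with the true selectivities.

```python
from itertools import permutations

# Hypothetical relation sizes and true join selectivities (illustrative only).
SIZES = {"R": 1000, "S": 10000, "T": 100}
TRUE_SEL = {frozenset(("R", "S")): 0.001, frozenset(("S", "T")): 0.001}


def plan_cost(order, sizes, sel):
    """Assumed cost model: sum of intermediate result sizes of a left-deep plan."""
    joined = [order[0]]
    rows = sizes[order[0]]
    cost = 0.0
    for rel in order[1:]:
        rows *= sizes[rel]
        # Apply every join predicate linking the new relation to those already
        # joined; a missing predicate means a cross product (selectivity 1.0).
        for prev in joined:
            rows *= sel.get(frozenset((prev, rel)), 1.0)
        joined.append(rel)
        cost += rows
    return cost


def best_plan(sizes, sel):
    """Exhaustively pick the cheapest left-deep join order."""
    return min(permutations(sizes), key=lambda order: plan_cost(order, sizes, sel))


def sensitivity(factors, pair=("R", "S")):
    """For each error factor, return (factor, chosen plan, cost penalty).

    The penalty is the true cost of the plan chosen under the misestimate
    divided by the true cost of the truly optimal plan.
    """
    optimal_cost = plan_cost(best_plan(SIZES, TRUE_SEL), SIZES, TRUE_SEL)
    results = []
    for factor in factors:
        estimated = dict(TRUE_SEL)
        estimated[frozenset(pair)] *= factor  # inject the estimation error
        chosen = best_plan(SIZES, estimated)
        penalty = plan_cost(chosen, SIZES, TRUE_SEL) / optimal_cost
        results.append((factor, chosen, penalty))
    return results


if __name__ == "__main__":
    for factor, chosen, penalty in sensitivity([0.1, 0.5, 1.0, 2.0, 10.0]):
        print(f"error factor {factor:>4}: plan {' -> '.join(chosen)}, penalty {penalty:.2f}x")
```

A penalty of 1.0 means the misestimate did not lead to a costlier plan, which is the sense in which a plan choice is insensitive to the error. A real optimizer would use a richer cost model and a larger plan space than this sketch does.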