An overview of query optimization in relational systems

There has been extensive work in query optimization since the early 1970s. It is hard to capture the breadth and depth of this large body of work in a short article. Therefore, I have decided to focus primarily on the optimization of SQL queries in relational database systems and to present my biased and incomplete view of this field. The goal of this article is not to be comprehensive, but rather to explain the foundations and present a sampling of significant work in this area. I would like to apologize to the many contributors in this area whose work I have failed to explicitly acknowledge due to oversight or lack of space. I take the liberty of trading technical precision for ease of presentation.
