Efficient exploitation of similar subexpressions for query processing

Complex queries often contain common or similar subexpressions, either within a single query or among multiple queries submitted as a batch. If so, query execution time can be improved by evaluating a common subexpression once and reusing the result in multiple places. However, current query optimizers do not recognize and exploit similar subexpressions, even within the same query. We present an efficient, scalable, and principled solution to this long-standing optimization problem. We introduce a light-weight and effective mechanism to detect potential sharing opportunities among expressions. Candidate covering subexpressions are constructed and optimization is resumed to determine which, if any, such subexpressions to include in the final query plan. The chosen subexpression(s) are computed only once and the results are reused to answer other parts of queries. Our solution automatically applies to optimization of query batches, nested queries, and maintenance of multiple materialized views. It is the first comprehensive solution covering all aspects of the problem: detection, construction, and cost-based optimization. Experiments on Microsoft SQL Server show significant performance improvements with minimal overhead.

[1]  Surajit Chaudhuri,et al.  Automated Selection of Materialized Views and Indexes in SQL Databases , 2000, VLDB.

[2]  Arie Segev,et al.  Using common subexpressions to optimize multiple queries , 1988, Proceedings. Fourth International Conference on Data Engineering.

[3]  Timos K. Sellis,et al.  On the Multiple-Query Optimization Problem , 1990, IEEE Trans. Knowl. Data Eng..

[4]  Kenneth A. Ross,et al.  Reusing invariants: a new strategy for correlated queries , 1998, SIGMOD '98.

[5]  Goetz Graefe,et al.  The Volcano optimizer generator: extensibility and efficient search , 1993, Proceedings of IEEE 9th International Conference on Data Engineering.

[6]  Abhinav Gupta,et al.  Optimizing Refresh of a Set of Materialized Views , 2005, VLDB.

[7]  Sheldon J. Finkelstein Common expression analysis in database applications , 1982, SIGMOD '82.

[8]  Jonathan Goldstein,et al.  Optimizing queries using materialized views: a practical, scalable solution , 2001, SIGMOD '01.

[9]  Prasan Roy,et al.  Efficient and extensible algorithms for multi query optimization , 1999, SIGMOD '00.

[10]  Shivakumar Venkataraman,et al.  Cost-based optimization of decision support queries using transient-views , 1998, SIGMOD '98.

[11]  Hamid Pirahesh,et al.  fAST refresh using mass query optimization , 2001, Proceedings 17th International Conference on Data Engineering.

[12]  Goetz Graefe The Cascades Framework for Query Optimization , 1995, IEEE Data Eng. Bull..

[13]  Krithi Ramamritham,et al.  Materialized view selection and maintenance using multi-query optimization , 2000, SIGMOD '01.

[14]  Jingren Zhou,et al.  Stacked indexed views in microsoft SQL server , 2005, SIGMOD '05.

[15]  Timos K. Sellis,et al.  Multiple-query optimization , 1988, TODS.

[16]  Kenneth A. Ross,et al.  Materialized view maintenance and integrity constraint checking: trading space for time , 1996, SIGMOD '96.