Including Group-By in Query Optimization

In existing relational database systems, group-by processing and the computation of aggregate functions are always postponed until all joins have been performed. In this paper, we present transformations that make it possible to push a group-by operation past one or more joins, which can significantly reduce the cost of processing a query. Because this reduction is not guaranteed, the placement of group-by should be decided by cost estimation. We explain how traditional System-R style optimizers can be modified to incorporate the greedy conservative heuristic that we developed. We prove that applying the greedy conservative heuristic produces plans that are no worse, and possibly better, than the plans generated by a traditional optimizer. Our experimental study shows that the improvement in plan quality is significant and comes with only a modest increase in optimization cost. Our technique also applies to the optimization of Select Distinct queries, where duplicate elimination is pushed down in a cost-based fashion.
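
To illustrate the kind of equivalence such a transformation relies on, the following is a minimal Python sketch, not the transformations or the heuristic defined in the paper. The tables `orders` and `customers`, their columns, and the helpers `join` and `group_sum` are hypothetical. The sketch assumes that `cust_id` is unique in `customers` and that the aggregate is a sum over a column of `orders` grouped by the join column; under those assumptions, aggregating `orders` before the join gives the same answer as aggregating after it, while feeding fewer rows into the join.

```python
from collections import defaultdict


def join(left, right, key):
    """Naive hash join of two lists of dicts on a shared column name."""
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def group_sum(rows, key, value):
    """Group rows by `key` and sum the `value` column per group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


# Hypothetical data: cust_id is assumed to be unique in `customers`.
orders = [
    {"cust_id": 1, "amount": 10},
    {"cust_id": 1, "amount": 5},
    {"cust_id": 2, "amount": 7},
    {"cust_id": 3, "amount": 4},  # no matching customer, dropped by the join
]
customers = [
    {"cust_id": 1, "region": "east"},
    {"cust_id": 2, "region": "west"},
]

# Plan A: join first, then group (the traditional placement).
late = group_sum(join(orders, customers, "cust_id"), "cust_id", "amount")

# Plan B: group first, then join (group-by pushed below the join).
pre_aggregated = [
    {"cust_id": k, "amount": v}
    for k, v in group_sum(orders, "cust_id", "amount").items()
]
early = {
    row["cust_id"]: row["amount"]
    for row in join(pre_aggregated, customers, "cust_id")
}

# Both plans agree, but Plan B sends 3 rows into the join instead of 4.
assert late == early
```

Whether Plan B is actually cheaper depends on how much the early group-by shrinks its input relative to the cost of computing it, which is why the choice is left to cost estimation.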
