Queries, or calls to stored procedures/user-defined functions are often invoked multiple times, either from within a loop in an application program, or from the where/select clause of an outer query. When the invoked query/procedure/function involves database access, a naive implementation can result in very poor performance, due to random I/O. Query decorrelation addresses this problem in the special case of nested sub-queries, but is not applicable otherwise. This problem is traditionally addressed by manually rewriting the application to make it set-oriented, by creating a batch of parameters, and by rewriting the query/procedure to work on the batch instead of one parameter at a time. Such manual rewriting is time-consuming and error prone.
In this paper, we propose techniques that can be used to do the following, (a) Automatically rewrite programs to replace multiple calls to a query by a batched call to a correspondingly rewritten query, (b) Rewrite a stored procedure/function to accept a batch of bindings, instead of a single binding. Thereby, for example, a query which would have been invoked many times from different invocations of a stored procedure would be automatically replaced by one (or a few) invocations of a batched version of the query.
Our techniques can be applied to code written in any language, such as procedural versions of SQL, or Java. We have implemented the proposed rewriting techniques for a subset of Java, where database operations are performed using an API over JDBC. We demonstrate the benefits due to our rewrites with three cases from real-world applications, which faced significant performance problems due to repeated invocations of queries/procedures.
[1]
Lawrence Rauchwerger,et al.
Parallelizing while loops for multiprocessor systems
,
1995,
Proceedings of 9th International Parallel Processing Symposium.
[2]
Goetz Graefe,et al.
Executing Nested Queries
,
2003,
BTW.
[3]
Umeshwar Dayal,et al.
Of Nests and Trees: A Unified Approach to Processing Queries That Contain Nested Subqueries, Aggregates, and Quantifiers
,
1987,
VLDB.
[4]
César A. Galindo-Legaria,et al.
Orthogonal optimization of subqueries and aggregation
,
2001,
SIGMOD '01.
[5]
Latha S. Colby.
A recursive algebra and query optimization for nested relations
,
1989,
SIGMOD '89.
[6]
David J. DeWitt,et al.
A transformation-based approach to optimizing loops in database programming languages
,
1992,
SIGMOD '92.
[7]
Steven S. Muchnick,et al.
Advanced Compiler Design and Implementation
,
1997
.
[8]
Hamid Pirahesh,et al.
Complex query decorrelation
,
1996,
Proceedings of the Twelfth International Conference on Data Engineering.
[9]
Won Kim,et al.
On optimizing an SQL-like nested query
,
1982,
TODS.
[10]
Harry K. T. Wong,et al.
Optimization of nested SQL queries revisited
,
1987,
SIGMOD '87.
[11]
Ken Kennedy,et al.
Loop distribution with arbitrary control flow
,
1990,
Proceedings SUPERCOMPUTING '90.