Rewriting procedures for batched bindings

Queries, or calls to stored procedures/user-defined functions are often invoked multiple times, either from within a loop in an application program, or from the where/select clause of an outer query. When the invoked query/procedure/function involves database access, a naive implementation can result in very poor performance, due to random I/O. Query decorrelation addresses this problem in the special case of nested sub-queries, but is not applicable otherwise. This problem is traditionally addressed by manually rewriting the application to make it set-oriented, by creating a batch of parameters, and by rewriting the query/procedure to work on the batch instead of one parameter at a time. Such manual rewriting is time-consuming and error prone. In this paper, we propose techniques that can be used to do the following, (a) Automatically rewrite programs to replace multiple calls to a query by a batched call to a correspondingly rewritten query, (b) Rewrite a stored procedure/function to accept a batch of bindings, instead of a single binding. Thereby, for example, a query which would have been invoked many times from different invocations of a stored procedure would be automatically replaced by one (or a few) invocations of a batched version of the query. Our techniques can be applied to code written in any language, such as procedural versions of SQL, or Java. We have implemented the proposed rewriting techniques for a subset of Java, where database operations are performed using an API over JDBC. We demonstrate the benefits due to our rewrites with three cases from real-world applications, which faced significant performance problems due to repeated invocations of queries/procedures.