Extracting Equivalent SQL from Imperative Code in Database Applications

Optimizing the performance of database applications is an area of practical importance and has received significant attention in recent years. In this paper we present an approach to this problem based on extracting a concise algebraic representation of (parts of) an application, which may include imperative code as well as SQL queries. The algebraic representation can then be translated into SQL to improve application performance, both by reducing the volume of data transferred and by reducing latency through fewer network round trips. Our techniques enable optimizations of database applications that previously proposed techniques cannot perform. The algebraic representations can also be used for other purposes, such as extracting equivalent queries for keyword search on form results. Our experiments indicate that the techniques we present are widely applicable to real-world database applications, both in successfully extracting algebraic representations of application behavior and in providing performance benefits when used for optimization.
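As an illustration of the kind of rewrite described above, the following minimal sketch contrasts an imperative loop that filters and sums rows in application code with a single SQL query that computes the same result inside the database. It is not the paper's extraction algorithm: the equivalent query is written by hand, and the `orders` table, its columns, and the function names are hypothetical, chosen only for this example. It uses Python's standard `sqlite3` module with an in-memory database.

```python
import sqlite3


def make_db():
    # Hypothetical schema and data, used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer, amount) VALUES (?, ?)",
        [("alice", 120.0), ("bob", 40.0), ("alice", 30.0), ("carol", 200.0)],
    )
    return conn


def total_imperative(conn, customer):
    # Imperative version: every row is transferred to the application,
    # then filtered and aggregated in a loop.
    total = 0.0
    for cust, amount in conn.execute("SELECT customer, amount FROM orders"):
        if cust == customer:
            total += amount
    return total


def total_sql(conn, customer):
    # Hand-written equivalent: the filter and the aggregation are pushed
    # into SQL, so only one value is transferred. COALESCE covers the
    # case where no row matches and SUM would return NULL.
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0.0) FROM orders WHERE customer = ?",
        (customer,),
    ).fetchone()
    return row[0]


conn = make_db()
print(total_imperative(conn, "alice"), total_sql(conn, "alice"))
```

Both functions return the same total for a given customer; the second moves the selection and aggregation to the database, which is where the reduction in transferred data comes from.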
