Froid: Optimization of Imperative Programs in a Relational Database

For decades, RDBMSs have supported declarative SQL as well as imperative functions and procedures as ways for users to express data processing tasks. While the evaluation of declarative SQL has received a lot of attention resulting in highly sophisticated techniques, the evaluation of imperative programs has remained naive and highly inefficient. Imperative programs offer several benefits over SQL and hence are often preferred and widely used. But unfortunately, their abysmal performance discourages, and even prohibits their use in many situations. We address this important problem that has hitherto received little attention. We present Froid, an extensible framework for optimizing imperative programs in relational databases. Froid's novel approach automatically transforms entire User Defined Functions (UDFs) into relational algebraic expressions, and embeds them into the calling SQL query. This form is now amenable to cost-based optimization and results in efficient, set-oriented, parallel plans as opposed to inefficient, iterative, serial execution of UDFs. Froid's approach additionally brings the benefits of many compiler optimizations to UDFs with no additional implementation effort. We describe the design of Froid and present our experimental evaluation that demonstrates performance improvements of up to multiple orders of magnitude on real workloads.

[1]  S. Sudarshan,et al.  Decorrelation of user defined function invocations in queries , 2014, 2014 IEEE 30th International Conference on Data Engineering.

[2]  Umeshwar Dayal,et al.  Of Nests and Trees: A Unified Approach to Processing Queries That Contain Nested Subqueries, Aggregates, and Quantifiers , 1987, VLDB.

[3]  Goetz Graefe,et al.  The Volcano optimizer generator: extensibility and efficient search , 1993, Proceedings of IEEE 9th International Conference on Data Engineering.

[4]  S. Sudarshan,et al.  Extracting Equivalent SQL from Imperative Code in Database Applications , 2016, SIGMOD Conference.

[5]  Steven S. Muchnick,et al.  Advanced Compiler Design and Implementation , 1997 .

[6]  Jeffrey D. Ullman,et al.  Flow graph reducibility , 1972, SIAM J. Comput..

[7]  Ken Kennedy,et al.  Optimizing Compilers for Modern Architectures: A Dependence-based Approach , 2001 .

[8]  Won Kim,et al.  On optimizing an SQL-like nested query , 1982, TODS.

[9]  Torsten Grabs,et al.  Execution strategies for SQL subqueries , 2007, SIGMOD '07.

[10]  Hamid Pirahesh,et al.  Complex query decorrelation , 1996, Proceedings of the Twelfth International Conference on Data Engineering.

[11]  Craig Freedman,et al.  Hekaton: SQL server's memory-optimized OLTP engine , 2013, SIGMOD '13.

[12]  Harry K. T. Wong,et al.  Optimization of nested SQL queries revisited , 1987, SIGMOD '87.

[13]  César A. Galindo-Legaria,et al.  Orthogonal optimization of subqueries and aggregation , 2001, SIGMOD '01.

[14]  Alvin Cheung,et al.  Optimizing database-backed applications with query synthesis , 2013, PLDI.

[15]  Alfons Kemper,et al.  Unnesting Arbitrary Queries , 2015, BTW.

[16]  Alfred V. Aho,et al.  Compilers: Principles, Techniques, and Tools , 1986, Addison-Wesley series in computer science / World student series edition.

[17]  Alvin Cheung,et al.  StatusQuo: Making Familiar Abstractions Perform Using Program Analysis , 2013, CIDR.

[18]  Robin Dewson Natively Compiled Stored Procedures , 2015 .

[19]  S. Sudarshan,et al.  DBridge: Translating Imperative Code to SQL , 2017, SIGMOD Conference.