The Flatter, the Better: Query Compilation Based on the Flattening Transformation

We demonstrate the ins and outs of a query compiler based on the flattening transformation, a translation technique designed by the programming language community to derive efficient data-parallel implementations from iterative programs. Flattening admits the straightforward formulation of intricate query logic, including deeply nested loops over (possibly ordered) data or the construction of rich data structures. To demonstrate the level of expressiveness that can be achieved, we will bring a compiler frontend that accepts queries embedded into the Haskell programming language. Compilation via flattening takes place in a series of simple steps, all of which will be made tangible by the demonstration. The final output is a program of lifted primitive operations which existing query engines can efficiently implement. We provide backends based on PostgreSQL and VectorWise to make this point; however, most set-oriented or data-parallel engines could benefit from a flattening-based query compiler.
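
To give a rough feel for what "a program of lifted primitive operations" can look like, the following minimal sketch contrasts a nested iterative program with a hand-flattened equivalent. It is written in Python for brevity and is not the compiler's actual output or intermediate language. The representation (a vector of segment lengths plus one flat value vector) and all function names (`flatten`, `lifted_double`, `segmented_sum`) are assumptions made for illustration only.

```python
from itertools import accumulate


def nested_program(xss):
    """Iterative formulation: an inner loop nested inside an outer loop."""
    return [sum(x * 2 for x in xs) for xs in xss]


def flatten(xss):
    """Encode a nested list as (segment lengths, flat value vector)."""
    lengths = [len(xs) for xs in xss]
    values = [x for xs in xss for x in xs]
    return lengths, values


def lifted_double(values):
    """Lifted 'x * 2': a single bulk pass over the flat value vector."""
    return [v * 2 for v in values]


def segmented_sum(lengths, values):
    """Lifted 'sum': one result per segment; empty segments yield 0."""
    ends = list(accumulate(lengths))
    starts = [end - n for end, n in zip(ends, lengths)]
    return [sum(values[s:e]) for s, e in zip(starts, ends)]


def flattened_program(xss):
    """The same computation, expressed without nested iteration."""
    lengths, values = flatten(xss)
    return segmented_sum(lengths, lifted_double(values))


data = [[1, 2], [3], []]
assert nested_program(data) == flattened_program(data)
```

In the flattened version, each step applies one bulk operation to an entire vector, which is the style of operation that set-oriented or data-parallel engines tend to execute well.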
