Permutable compiled queries: dynamically adapting compiled queries without recompiling

Just-in-time (JIT) query compilation is a technique to improve analytical query performance in database management systems (DBMSs). But the cost of compiling each query can be significant relative to its execution time. This overhead prohibits the DBMS from employing well-known adaptive query processing (AQP) methods to generate a new plan for a query if data distributions do not match the optimizer’s estimations. The optimizer could eagerly generate multiple sub-plans for a query, but it can only include a few alternatives as each addition increases the compilation time. We present a method, called Permutable Compiled Queries (PCQ), that bridges the gap between JIT compilation and AQP. It allows the DBMS to modify compiled queries without recompiling them or including all possible variations before the query starts. With PCQ, the DBMS structures a query’s code with indirection layers that enable the DBMS to change the plan even while it is running. We implement PCQ in an in-memory DBMS and compare it against non-adaptive plans in a microbenchmark and against state-of-the-art analytic DBMSs. Our evaluation shows that PCQ outperforms static plans by more than 4× and yields better performance on an analytical benchmark by more than 2× against other DBMSs.

PVLDB Reference Format: Prashanth Menon, Amadou Ngom, Lin Ma, Todd C. Mowry, Andrew Pavlo. Permutable Compiled Queries: Dynamically Adapting Compiled Queries without Recompiling. PVLDB, 14(2): 101-113, 2021. doi:10.14778/3425879.3425882
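The indirection idea in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the class `PermutableFilter`, its methods, and the reordering policy (lowest observed pass rate first) are hypothetical names and choices made only for illustration. The point it shows is that when a scan loop reaches its predicates through a mutable table rather than a fixed call sequence, the evaluation order can be changed between batches without regenerating the loop itself.

```python
from typing import Callable, Dict, List, Sequence

Row = Dict[str, int]
Predicate = Callable[[Row], bool]


class PermutableFilter:
    """Hypothetical sketch: a conjunctive filter whose predicate order
    lives in an indirection table that can be permuted at run time."""

    def __init__(self, predicates: Sequence[Predicate]) -> None:
        self.predicates: List[Predicate] = list(predicates)
        # Indirection layer: the scan loop only ever reads this table.
        self.order: List[int] = list(range(len(self.predicates)))
        # Per-predicate counters used to estimate pass rates.
        self.seen: List[int] = [0] * len(self.predicates)
        self.passed: List[int] = [0] * len(self.predicates)

    def run_batch(self, rows: Sequence[Row]) -> List[Row]:
        """Apply all predicates (short-circuiting) in the current order."""
        out: List[Row] = []
        for row in rows:
            keep = True
            for idx in self.order:
                self.seen[idx] += 1
                if self.predicates[idx](row):
                    self.passed[idx] += 1
                else:
                    keep = False
                    break
            if keep:
                out.append(row)
        return out

    def reorder(self) -> None:
        """Assumed policy: evaluate the predicate with the lowest observed
        pass rate first. Counters are biased by short-circuiting, so this
        is only a rough heuristic."""
        def pass_rate(i: int) -> float:
            return self.passed[i] / self.seen[i] if self.seen[i] else 1.0

        self.order.sort(key=pass_rate)


if __name__ == "__main__":
    rows = [{"a": i} for i in range(100)]
    f = PermutableFilter([lambda r: r["a"] >= 0, lambda r: r["a"] < 10])
    first = f.run_batch(rows[:50])
    f.reorder()  # permute the table; the scan loop itself is unchanged
    second = f.run_batch(rows[50:])
    print(len(first), len(second), f.order)
```

The filter result does not depend on the order of a conjunction, so permuting the table changes only how much work each row costs. In a compiled engine the table entries would presumably be pointers to generated functions rather than Python callables.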
