Simpli-Squared: A Very Simple Yet Unexpectedly Powerful Join Ordering Algorithm Without Cardinality Estimates

The Join Order Benchmark (JOB) has become the de facto standard for assessing the performance of relational database query optimizers due to its complexity and completeness. To compute the optimal execution plan (the join order), existing solutions employ extensive data synopses and correlations (functional dependencies) between table attributes. These structures incur significant overhead to design, build, and maintain. In this paper, we present Simplicity Simplified (Simpli-Squared), a very simple join ordering algorithm that achieves unexpectedly good results. Simpli-Squared computes the join order without using any statistics or cardinality estimates. It takes as input only the referential integrity constraints declared at schema definition and the number of tuples (size) in the base tables. The join order of a given query is computed by splitting the join graph along the many-to-many joins and sorting the tables based on their size. The tables involved in one-to-many joins are then greedily included based on their size and the query join graph. The resulting plan can be generated efficiently by a lightweight query rewriting procedure. Experiments on JOB in PostgreSQL show that Simpli-Squared achieves runtimes that are at most 16% higher, and sometimes even lower, than those of four state-of-the-art solutions that are considerably more intricate. Based on these results, we question whether JOB adequately tests query optimizers, or whether accurate cardinality estimation is really such a fundamental requirement for performing well on it.
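
The abstract describes the ordering procedure only at a high level, so the following Python sketch is one possible reading of it, not the authors' implementation. It assumes each join is already labeled as one-to-many ("fk", derived from a declared referential integrity constraint) or many-to-many ("m2m"). The function name `sketch_join_order`, the labels, the tie-breaking rule, the choice to seed each component with its smallest table, and the toy row counts are all hypothetical.

```python
from collections import defaultdict


def sketch_join_order(sizes, joins):
    """Return a table order from base-table sizes and labeled joins.

    sizes: {table: row_count}
    joins: [(left, right, kind)] with kind in {"fk", "m2m"} (assumed labels)
    """
    # Build adjacency over one-to-many joins only. Ignoring the many-to-many
    # edges is what splits the join graph into separate components.
    fk_adj = defaultdict(set)
    for left, right, kind in joins:
        if kind == "fk":
            fk_adj[left].add(right)
            fk_adj[right].add(left)

    def by_size(table):
        # Sort by row count; the table name breaks ties deterministically.
        return (sizes[table], table)

    ordered, seen = [], set()
    # Visit tables from smallest to largest. Each unseen table seeds a new
    # component, so components are ordered by their smallest table.
    for seed in sorted(sizes, key=by_size):
        if seed in seen:
            continue
        seen.add(seed)
        ordered.append(seed)
        frontier = set(fk_adj[seed]) - seen
        # Greedily pull in the smallest table reachable via a one-to-many join.
        while frontier:
            nxt = min(frontier, key=by_size)
            frontier.discard(nxt)
            if nxt in seen:
                continue
            seen.add(nxt)
            ordered.append(nxt)
            frontier |= fk_adj[nxt] - seen
    return ordered


# Toy row counts for illustration only; these are not real IMDB table sizes.
toy_sizes = {
    "keyword": 100,
    "company_name": 200,
    "title": 1000,
    "movie_companies": 3000,
    "movie_keyword": 5000,
}
toy_joins = [
    ("keyword", "movie_keyword", "fk"),
    ("title", "movie_keyword", "fk"),
    ("company_name", "movie_companies", "fk"),
    ("movie_keyword", "movie_companies", "m2m"),
]
print(sketch_join_order(toy_sizes, toy_joins))
```

Note that the sketch reads nothing but row counts and join labels, which mirrors the abstract's claim that no statistics or cardinality estimates are needed. How the paper actually orders the components and rewrites the query may differ from this reading.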
