Non-exhaustive Join Ordering Search Algorithms for LJQO

In relational database systems, optimizing select-project-join queries is a combinatorial problem. Exhaustive search is prohibitive because the search space grows exponentially with the number of joined relations. Randomized search algorithms are used instead to find near-optimal plans in polynomial time. In this paper, we investigate the large join query optimization (LJQO) problem by extending randomized algorithms and implementing a two-phase optimization (2PO) algorithm as the query optimizer of a popular open-source DBMS. We compare our solution with an implementation of a genetic algorithm. Using a multidimensional test schema, we discuss the strengths and weaknesses of each algorithm's behavior. Our results show that the 2PO algorithm runs quickly and that, in most cases, the plans it generates cost less than those produced by the genetic algorithm.
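To make the idea concrete, the following is a minimal sketch of 2PO as it is commonly described in the randomized-optimization literature: several runs of iterative improvement from random starting plans, followed by simulated annealing seeded with the best local minimum at a low initial temperature. It is not the implementation evaluated in this paper. The toy cost model (sum of intermediate result sizes over a left-deep join order), the swap neighborhood, and every parameter value (`starts`, `max_fails`, `stages`, `cooling`, the 0.1 temperature factor) are assumptions chosen for illustration.

```python
import math
import random

def plan_cost(order, card, sel):
    """Toy cost (assumption): sum of intermediate result sizes, left-deep order."""
    size = card[order[0]]
    joined = [order[0]]
    total = 0.0
    for rel in order[1:]:
        size *= card[rel]
        for prev in joined:
            # A missing predicate means selectivity 1.0 (Cartesian product).
            size *= sel.get(frozenset((prev, rel)), 1.0)
        joined.append(rel)
        total += size
    return total

def random_neighbor(order, rng):
    """Swap two randomly chosen positions (hypothetical move set)."""
    i, j = rng.sample(range(len(order)), 2)
    nxt = list(order)
    nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt

def iterative_improvement(start, cost, rng, max_fails=50):
    """Accept only downhill moves; stop after max_fails consecutive rejections."""
    cur, cur_cost = start, cost(start)
    fails = 0
    while fails < max_fails:
        cand = random_neighbor(cur, rng)
        cand_cost = cost(cand)
        if cand_cost < cur_cost:
            cur, cur_cost, fails = cand, cand_cost, 0
        else:
            fails += 1
    return cur, cur_cost

def simulated_annealing(start, cost, rng, temp, stages=50, steps=20, cooling=0.9):
    """Accept uphill moves with probability exp(-delta / temp); track the best plan."""
    cur, cur_cost = start, cost(start)
    best, best_cost = cur, cur_cost
    for _ in range(stages):
        for _ in range(steps):
            cand = random_neighbor(cur, rng)
            cand_cost = cost(cand)
            delta = cand_cost - cur_cost
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                cur, cur_cost = cand, cand_cost
                if cur_cost < best_cost:
                    best, best_cost = cur, cur_cost
        temp *= cooling
    return best, best_cost

def two_phase_optimization(relations, card, sel, starts=10, seed=0):
    rng = random.Random(seed)
    cost = lambda order: plan_cost(order, card, sel)
    best, best_cost = None, math.inf
    # Phase 1: iterative improvement from several random starting orders.
    for _ in range(starts):
        start = list(relations)
        rng.shuffle(start)
        local, local_cost = iterative_improvement(start, cost, rng)
        if local_cost < best_cost:
            best, best_cost = local, local_cost
    # Phase 2: annealing from the best local minimum, low initial temperature.
    return simulated_annealing(best, cost, rng, temp=0.1 * best_cost)

# Made-up star-like schema: one fact table joined to four dimension tables.
relations = ["fact", "d1", "d2", "d3", "d4"]
card = {"fact": 1_000_000, "d1": 100, "d2": 1_000, "d3": 50, "d4": 10_000}
sel = {frozenset(("fact", d)): 1.0 / card[d] for d in ("d1", "d2", "d3", "d4")}

order, cost = two_phase_optimization(relations, card, sel)
print(order, cost)
```

The second phase can only return a plan at least as cheap as the one it starts from, because it tracks the best plan seen rather than the final one. A real optimizer would replace `plan_cost` with the DBMS cost model and would typically search bushy trees as well as left-deep orders.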
