Optimization of Multi-Way Join Queries for Parallel Execution

Most existing relational database query optimizers consider only linear join trees when generating multi-way join plans, in order to reduce optimization overhead. For multiprocessor computer systems, this strategy seems inadequate, since it may prune the search space so heavily that near-optimal plans are excluded. In this paper we present a framework for optimizing multi-way join queries in multiprocessor computer systems. The optimization process determines not only the order in which the joins are performed and the method used for each, but also the number of joins that should be executed in parallel and the number of processors that should be allocated to each join. A preliminary performance study shows that the optimizer usually generates optimal or near-optimal plans when the number of joins is relatively small. Even as the number of joins increases, the algorithm still gives reasonably good performance. Furthermore, the optimization overhead is much lower than that of exhaustive search.
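
To give a rough sense of how much of the search space a linear-only optimizer leaves out, the sketch below counts join tree shapes for n relations. It is an illustration under simplifying assumptions, not the framework presented in the paper: cross products are allowed, the left and right inputs of a join are distinguished, and join methods and processor allocation are ignored. The function names are hypothetical.

```python
from math import comb, factorial


def linear_plan_count(n: int) -> int:
    """Number of linear (left-deep) join orders over n relations.

    Assumes any permutation of the relations is a valid order.
    """
    return factorial(n)


def bushy_plan_count(n: int) -> int:
    """Number of ordered binary join trees (linear and bushy) over n relations.

    Tree shapes with n leaves are counted by the Catalan number C(n-1);
    each shape can be labeled with the n relations in n! ways.
    """
    catalan = comb(2 * (n - 1), n - 1) // n
    return factorial(n) * catalan


if __name__ == "__main__":
    for n in (2, 4, 6, 8):
        print(n, linear_plan_count(n), bushy_plan_count(n))
```

Under these assumptions the gap between the two counts widens quickly with n, which is why considering bushy plans calls for a search strategy cheaper than exhaustive enumeration.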
