Multi-weighted tree based query optimization method for parallel relational database systems

A multi-weighted tree based query optimization method for parallel relational databases is proposed. The method consists of a multi-weighted tree based parallel query plan model, a cost model for parallel query plans, and a query optimizer. The parallel query plan model captures three types of parallelism in query execution, the allocation of processors and memory to operations, the allocation of memory to buffers in pipelines, and data redistribution among processors. The cost model accounts for the waiting time of operations in pipelined execution and can be computed bottom-up. The query optimizer addresses the optimization of Select-Project-Join queries. Heuristics for determining the processor allocation to operations and the memory allocation to operations and pipeline buffers are derived and used in the optimizer. In addition, the optimizer considers multiple join algorithms and can choose the optimal join algorithm for each join operation in a query.
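
The idea of a plan tree whose nodes carry several weights, with a cost computed bottom-up, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the method itself: the `PlanNode` fields, the `work` estimate, and the toy cost rules (a blocking operator starts after its slowest input; a pipelined operator overlaps with its inputs and is held back by the slowest one) are hypothetical stand-ins for the actual plan model and cost model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """One operation in a query plan tree, annotated with several weights.

    All fields are hypothetical; they only illustrate the idea of attaching
    resource allocations to each node of a plan tree.
    """
    op: str                   # operation name, e.g. "scan" or "hash_join"
    processors: int           # weight: processors allocated to the operation
    memory_pages: int         # weight: memory allocated (unused by the toy cost)
    work: float               # assumed sequential work of the operation
    pipelined: bool = False   # True if the node consumes its inputs in a pipeline
    children: List["PlanNode"] = field(default_factory=list)


def cost(node: PlanNode) -> float:
    """Toy response-time estimate, computed bottom-up over the tree."""
    # Assumption: work divides evenly across the allocated processors.
    local = node.work / node.processors
    if not node.children:
        return local
    # Children are costed first, so the recursion is bottom-up.
    child_costs = [cost(child) for child in node.children]
    slowest_input = max(child_costs)
    if node.pipelined:
        # Pipelined: the operator overlaps with its producers, so it cannot
        # finish before the slowest of them (a crude model of waiting time).
        return max(local, slowest_input)
    # Blocking: inputs run concurrently, then the operator runs.
    return slowest_input + local


scan_r = PlanNode("scan", processors=2, memory_pages=10, work=8.0)
scan_s = PlanNode("scan", processors=4, memory_pages=10, work=8.0)
blocking_join = PlanNode("hash_join", 4, 20, 12.0, children=[scan_r, scan_s])
pipelined_join = PlanNode("hash_join", 4, 20, 12.0, pipelined=True,
                          children=[scan_r, scan_s])

print(cost(blocking_join))   # 7.0
print(cost(pipelined_join))  # 4.0
```

In this sketch, changing the `processors` weight of a node changes only that node's local term, which is why a bottom-up cost is convenient when an optimizer compares alternative allocations for one operation at a time.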
