Performance evaluation of processor allocation algorithms for parallel query execution

Two processor allocation algorithms are presented, based on a query cost model that incorporates the effects of data communication overheads and load imbalance on a shared-nothing parallel architecture. The phase-based algorithm uses the heuristic of merge-point evaluation so that operations are distributed evenly across the execution phases. A time equalisation technique is employed within each phase to minimise phase execution time. The non-phase-based algorithm is a dynamic approach, and its performance is sensitive to the number of processors available. Under this algorithm, the operations in the query tree are executed in an order such that leaf operations are conducted first and processing ends with the root operation. The concept of an optimal degree of parallelism for each operation is also introduced, and the new algorithms are evaluated in a simulation study. The experimental results show that the new algorithms outperform the conventional processor allocation algorithms. In addition, the phase-based algorithm always provides a local phase optimisation, and the non-phase-based algorithm gives a globally optimal solution when a large number of processors is available.
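The ideas of an optimal degree of parallelism and of time equalisation within a phase can be illustrated with a minimal sketch. The cost model here is an assumption made for illustration only, not the model used in the paper: an operation with total work `work` and a per-processor communication cost `comm` is taken to run in `work / n + comm * n` time units on `n` processors. The names `exec_time`, `optimal_degree`, and `equalise_phase` are hypothetical, and the greedy loop is one simple way to approximate time equalisation, not necessarily the authors' procedure.

```python
def exec_time(work, comm, n):
    """Assumed cost model: divisible work plus overhead that grows with n."""
    return work / n + comm * n


def optimal_degree(work, comm, max_procs):
    """Smallest processor count in 1..max_procs that minimises exec_time."""
    return min(range(1, max_procs + 1),
               key=lambda n: (exec_time(work, comm, n), n))


def equalise_phase(ops, total_procs):
    """Greedy time equalisation for one phase (illustrative only).

    ops is a list of (work, comm) pairs; total_procs must be >= len(ops).
    Each operation starts with one processor. Every remaining processor goes
    to the currently slowest operation, provided it still speeds that
    operation up under the assumed cost model.
    """
    if total_procs < len(ops):
        raise ValueError("need at least one processor per operation")
    alloc = [1] * len(ops)
    for _ in range(total_procs - len(ops)):
        slowest = max(range(len(ops)),
                      key=lambda k: exec_time(ops[k][0], ops[k][1], alloc[k]))
        work, comm = ops[slowest]
        current = exec_time(work, comm, alloc[slowest])
        if exec_time(work, comm, alloc[slowest] + 1) >= current:
            # The slowest operation is at its optimal degree of parallelism,
            # so more processors cannot shorten the phase.
            break
        alloc[slowest] += 1
    return alloc


if __name__ == "__main__":
    phase = [(100.0, 1.0), (400.0, 1.0)]  # made-up (work, comm) values
    allocation = equalise_phase(phase, 10)
    phase_time = max(exec_time(w, c, n) for (w, c), n in zip(phase, allocation))
    print(allocation, phase_time)
```

Under this assumed model, adding processors beyond the optimal degree increases execution time because communication overhead dominates, which is why the greedy loop stops once the slowest operation can no longer be improved.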
