Parallel evaluation of multi-join queries

Several execution strategies for the parallel evaluation of multi-join queries have been proposed in the literature, and their performance has been evaluated by simulation. In this paper we give a comparative performance evaluation of four execution strategies by implementing all of them on the same parallel database system, PRISMA/DB. Experiments were run on up to 80 processors. The basic approach is to first determine an execution schedule with minimum total cost and then parallelize this schedule using one of the four execution strategies. These strategies, taken from the literature, are Sequential Parallel, Synchronous Execution, Segmented Right-Deep, and Full Parallel. Based on the experiments, clear guidelines are given on when to use which strategy.