Workload balance and page access scheduling for parallel joins in shared-nothing systems

A methodology to resolve balancing and scheduling issues for parallel join execution in a shared-nothing multiprocessor environment is presented. In the past, research on parallel join methods focused on the design of algorithms for partitioning relations and distributing data buckets as evenly as possible to the processors. Once data are uniformly distributed to the processors, it is assumed that all processors will complete their tasks at about the same time. The authors stress that this assumption holds if no further information, such as a page-level join index, is available. Otherwise, the join execution can be further optimized, and the workload across the processors may still be unbalanced. The authors study these problems in a shared-nothing environment.
