A Pipeline N-way Join Algorithm Based on the 2-way Semijoin Program

The semijoin has been used as an effective operator for reducing data transmission and processing over a network: it allows forward size reduction of relations and of the intermediate results generated while processing a distributed query. The authors propose a relational operator, the two-way semijoin, which enhances the semijoin with a backward size reduction capability for more cost-effective query processing. A pipeline N-way join algorithm for joining the reduced relations residing on N sites is introduced. The main advantage of this algorithm is that it eliminates the need to transfer and store intermediate results among the sites. A set of experiments is included, showing that the proposed algorithm outperforms all known conventional join algorithms that generate intermediate results.
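The forward and backward reduction described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the authors' implementation: both relations are in-memory lists of dictionaries, network transfer is represented by passing a set of join-attribute values, and the names `two_way_semijoin`, `r`, `s`, and `attr` are hypothetical.

```python
def two_way_semijoin(r, s, attr):
    """Reduce r and s to the tuples that take part in their join on attr."""
    # Forward step: project r on the join attribute and "ship" the values
    # to the site holding s, where s is reduced (an ordinary semijoin).
    r_values = {row[attr] for row in r}
    s_reduced = [row for row in s if row[attr] in r_values]

    # Backward step: "ship" the values that survived in s back to the
    # site holding r, and use them to reduce r as well.
    matched = {row[attr] for row in s_reduced}
    r_reduced = [row for row in r if row[attr] in matched]
    return r_reduced, s_reduced


# Toy relations (made up for this example) sharing the join attribute "a".
r = [{"a": 1, "x": "p"}, {"a": 2, "x": "q"}, {"a": 3, "x": "r"}]
s = [{"a": 2, "y": "u"}, {"a": 3, "y": "v"}, {"a": 4, "y": "w"}]
r_reduced, s_reduced = two_way_semijoin(r, s, "a")
```

After the call, both relations hold only the tuples whose `a` value appears on both sides, so a later join step does not have to handle dangling tuples from either relation.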
