Evaluating multiple join queries in a distributed database system

It is proposed that the execution of a set of join queries in a distributed environment be considered cooperatively, rather than as a set of separate requests. On this basis, a model of multiple query execution is formulated as a linear integer program. Several requests are issued to the distributed database management system, each specifying the collation of information drawn from a number of logically distinct data sets, or relations, that are dispersed across the sites of a distributed system. Performing these tasks consumes limited resources, so efficient management means imposing the smallest possible additional load on them. Both the processors and the data communication devices that interconnect them are used; an optimal strategy is defined as one that minimizes a weighted sum of the costs of computation and those of information exchange incurred in resolving the group of queries. Previous models of join query evaluation regard each query in isolation and so produce a sequence of independent execution strategies, one for every request. By instead permitting intermediate computations to be used more than once, any overlap between the queries can be exploited to further reduce the total demand placed on the system as a whole. Through an investigation of the character of a number of interacting join computations, performed at a single site in isolation, an earlier single query model [1] is extended to support the cooperative execution of an entire group.
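The benefit of reusing intermediate computations under a weighted cost objective can be illustrated with a minimal sketch. This is not the linear integer program of the paper: the relation names, the step costs, and the functions `weighted_cost`, `independent_cost`, and `cooperative_cost` are all hypothetical, and the sketch assumes that a shared intermediate result costs nothing extra to reuse, which ignores any shipping of that result between sites.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical per-step costs as (computation cost, communication cost).
# These numbers are illustrative assumptions, not taken from the paper.
COSTS: Dict[str, Tuple[float, float]] = {
    "R join S": (4.0, 2.0),
    "(R join S) join T": (3.0, 1.0),
    "(R join S) join U": (5.0, 2.0),
}

# Each query is listed as the join steps its execution strategy performs.
# Both queries need the intermediate result "R join S".
QUERIES: List[List[str]] = [
    ["R join S", "(R join S) join T"],
    ["R join S", "(R join S) join U"],
]


def weighted_cost(steps: Iterable[str], w_comp: float = 1.0, w_comm: float = 1.0) -> float:
    """Weighted sum of computation and communication costs over the given steps."""
    return sum(w_comp * COSTS[s][0] + w_comm * COSTS[s][1] for s in steps)


def independent_cost(queries: List[List[str]], w_comp: float = 1.0, w_comm: float = 1.0) -> float:
    """Total cost when every query is evaluated in isolation."""
    return sum(weighted_cost(q, w_comp, w_comm) for q in queries)


def cooperative_cost(queries: List[List[str]], w_comp: float = 1.0, w_comm: float = 1.0) -> float:
    """Total cost when each distinct step is performed once and its result shared."""
    distinct_steps = {s for q in queries for s in q}
    return weighted_cost(distinct_steps, w_comp, w_comm)


if __name__ == "__main__":
    print("independent:", independent_cost(QUERIES))
    print("cooperative:", cooperative_cost(QUERIES))
```

With these assumed costs, the cooperative total is lower because the shared step is charged once rather than once per query. A full model would additionally have to choose the sites at which each step runs, which is where an integer programming formulation comes in.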

[1] Philip A. Bernstein, et al. Using Semi-Joins to Solve Relational Queries, 1981, JACM.

[2] Hamdy A. Taha, et al. Integer Programming: Theory, Applications, and Computations, 1975.

[3] Randall L. Hyde, et al. An Analysis of Degenerate Sharing and False Coherence, 1996, J. Parallel Distributed Comput.

[4] M. J. Garber, et al. Introduction to Linear Programming, 1973.

[5] R. J. Dakin, et al. A tree-search algorithm for mixed integer programming problems, 1965, Comput. J.

[6] Alfred V. Aho, et al. The theory of joins in relational data bases, 1977, 18th Annual Symposium on Foundations of Computer Science (sfcs 1977).

[7] Gautam Mitra. Investigation of some branch and bound strategies for the solution of mixed integer linear programs, 1973, Math. Program.

[8] M. W. Orlowski. On Optimisation of Joins in Distributed Database System, 1992, Future Databases.

[9] Laurence A. Wolsey, et al. Generalized dynamic programming methods in integer programming, 1973, Math. Program.

[10] Chao-Chih Yang. Relational databases, 1985.

[11] Catriel Beeri, et al. On the Desirability of Acyclic Database Schemes, 1983, JACM.

[12] Ronald Fagin, et al. Degrees of acyclicity for hypergraphs and relational database schemes, 1983, JACM.

[13] Sakti Pramanik, et al. Optimizing Join Queries in Distributed Databases, 1988, IEEE Trans. Software Eng.

[14] Chihping Wang. The complexity of processing tree queries in distributed databases, 1990, Proceedings of the Second IEEE Symposium on Parallel and Distributed Processing 1990.

[15] Jeffrey D. Ullman, et al. Principles of Database Systems, 1980.

[16] D. J. Reid. Optimal distributed execution of join queries, 1994.

[17] Jorma Rissanen, et al. Independent components of relations, 1977, TODS.

[18] D. J. Reid. Incorporating processor costs in optimizing the distributed execution of join queries, 1994.

[19] S. Vajda, et al. Integer Programming and Network Flows, 1970.

[20] Ramez Elmasri, et al. Fundamentals of Database Systems, 1989.

[21] Dennis Shasha, et al. Optimizing equijoin queries in distributed databases where relations are hash partitioned, 1991, TODS.