Using semiouterjoins to process queries in multidatabase systems

A multidatabase system provides a logically integrated view of existing, possibly inconsistent, databases. Logical integration is achieved primarily through the use of generalization, which can be modelled algebraically as a sequence of outerjoin and aggregation operations. Conventional distributed query processing techniques are inadequate for processing queries over views defined by outerjoins and aggregates. In a conventional distributed database system, selections and projections are inexpensive to process; hence joins have been the focus of most previous research. In a multidatabase system, however, even selections and projections can be as expensive as joins. The semiouterjoin operation can potentially reduce query processing costs. In general, there may be many different strategies based on semiouterjoins for processing a given query. The query optimization problem is to choose the most profitable of these strategies. This paper studies the query optimization problem for selection and projection queries. It develops linear-time solutions to the problem, and then extends these solutions to provide heuristics for joins and conjunctive queries.
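As a rough illustration of the semiouterjoin idea, the sketch below assumes the operation splits a relation R into two fragments with respect to a relation S: the tuples of R whose join key appears in S, and the tuples of R whose key does not. The abstract does not state this definition, so it is an assumption here. The relation contents and the names `semiouterjoin`, `emp_site1`, and `emp_site2` are hypothetical.

```python
def semiouterjoin(r, s, key):
    """Split r into (matched, unmatched) by whether row[key] occurs in s.

    Assumed definition: 'matched' is the semijoin of r by s on key,
    and 'unmatched' is the remainder of r.
    """
    # Only the key values of s are needed, not its full tuples.
    s_keys = {row[key] for row in s}
    matched = [row for row in r if row[key] in s_keys]
    unmatched = [row for row in r if row[key] not in s_keys]
    return matched, unmatched


# Hypothetical relations held at two different sites.
emp_site1 = [
    {"id": 1, "salary": 100},
    {"id": 2, "salary": 120},
    {"id": 3, "salary": 90},
]
emp_site2 = [
    {"id": 2, "salary": 125},
    {"id": 4, "salary": 80},
]

matched, unmatched = semiouterjoin(emp_site1, emp_site2, "id")
print("matched:", matched)
print("unmatched:", unmatched)
```

Under this reading, only the key column of one relation has to be shipped to the other site. The unmatched fragment can be handled locally, and only the matched fragment needs the outerjoin and aggregation steps that reconcile the two databases.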
