Outerjoins as disjunctions

The outerjoin operator is currently available in the query languages of several major DBMSs, and it is included in the proposed SQL2 standard draft. However, “associativity problems” of the operator have been pointed out since its introduction. In this paper we propose a shift in the intuition behind outerjoin: instead of computing the join while also preserving its arguments, outerjoin delivers tuples that come either from the join or from the arguments. Queries with joins and outerjoins deliver tuples that come from one of several joins, where a single relation is a trivial join. An advantage of this view is that, in contrast to preservation, disjunction is commutative and associative, a property that matters for intuition, for formalisms, and for the generation of execution plans. Based on a disjunctive normal form, we show that some data merging queries cannot be evaluated by means of binary outerjoins, and we give alternative procedures to evaluate those queries. We also explore several evaluation strategies for outerjoin queries, including the use of semijoin programs to reduce base relations.
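The disjunctive reading can be illustrated with a small sketch. It is not taken from the paper: it assumes relations are lists of dictionaries, that nulls are represented by `None`, that base relations contain no nulls, and that redundant tuples are dropped by a subsumption check. The function names (`natural_join`, `minimum_union`, `full_outerjoin_as_disjunction`) and the sample data are hypothetical.

```python
def natural_join(r, s):
    """Natural join of two lists of dicts on their shared attribute names."""
    out = []
    for a in r:
        for b in s:
            shared = a.keys() & b.keys()
            if all(a[k] == b[k] for k in shared):
                out.append({**a, **b})
    return out


def pad(rows, schema):
    """Extend every row to the full schema, filling missing attributes with None."""
    return [{k: row.get(k) for k in schema} for row in rows]


def subsumes(t, u):
    """True if t differs from u and agrees with u on every non-null value of u."""
    return t != u and all(v is None or t[k] == v for k, v in u.items())


def minimum_union(*relations, schema):
    """Union of the padded relations, without duplicates and subsumed tuples."""
    rows = []
    for rel in relations:
        for row in pad(rel, schema):
            if row not in rows:
                rows.append(row)
    return [u for u in rows if not any(subsumes(t, u) for t in rows)]


def full_outerjoin_as_disjunction(r, s, schema):
    """Each result tuple comes from the join, from r alone, or from s alone."""
    return minimum_union(natural_join(r, s), r, s, schema=schema)


# Hypothetical sample data (assumption, for illustration only).
emp = [{"name": "ann", "dept": 1}, {"name": "bob", "dept": 2}]
dept = [{"dept": 1, "title": "sales"}, {"dept": 3, "title": "ops"}]
schema = ("name", "dept", "title")

result = full_outerjoin_as_disjunction(emp, dept, schema)
swapped = full_outerjoin_as_disjunction(dept, emp, schema)
```

Here `result` holds one tuple from the join (`ann`, department 1, `sales`), one from `emp` alone (`bob`, padded with a null title), and one from `dept` alone (department 3, padded with a null name). Swapping the arguments yields the same set of tuples, which reflects the commutativity of the disjunctive view for this two-sided case.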
