An incremental algorithm for computing ranked full disjunctions

The full disjunction is a variation of the join operator that maximally combines tuples from connected relations, while preserving all information in the relations. The full disjunction can be seen as a natural extension of the binary outerjoin operator to an arbitrary number of relations and is a useful operator for information integration. This paper presents the algorithm IncrementalFD for computing the full disjunction of a set of relations. IncrementalFD improves upon previous algorithms for computing the full disjunction in four ways. First, it has a lower total runtime when computing the full result and a lower runtime when computing only k tuples of the result, for any constant k. Second, for a natural class of ranking functions, IncrementalFD can be adapted to return tuples in ranking order. Third, a variation of IncrementalFD can be used to return approximate full disjunctions (which contain maximal approximately join consistent tuples). Fourth, IncrementalFD can be adapted to have a block-based execution, instead of a tuple-based execution.
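To make the definition concrete, the following is a minimal brute-force sketch of what a full disjunction contains. It is not IncrementalFD and takes time exponential in the number of relations. It assumes the standard definition: the result consists of the maximal sets of tuples (at most one per relation) that are pairwise join consistent and connected through shared attributes, with each set merged into one tuple padded with nulls. It also assumes the input relations contain no nulls. The names `is_jcc`, `full_disjunction`, `RELATIONS`, and the sample data are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product


def is_jcc(tuples):
    """Return True if the tuples are pairwise join consistent and connected."""
    # Join consistency: every pair agrees on all attributes they share.
    for a, b in combinations(tuples, 2):
        if any(a[k] != b[k] for k in a.keys() & b.keys()):
            return False
    # Connectivity: walk the graph whose edges link tuples sharing an attribute.
    reached, frontier = {0}, [0]
    while frontier:
        i = frontier.pop()
        for j in range(len(tuples)):
            if j not in reached and tuples[i].keys() & tuples[j].keys():
                reached.add(j)
                frontier.append(j)
    return len(reached) == len(tuples)


def full_disjunction(relations):
    """Brute-force full disjunction of {relation name: list of dict rows}."""
    names = list(relations)
    candidates = []
    # Enumerate every choice of at most one tuple per relation.
    for size in range(1, len(names) + 1):
        for chosen in combinations(names, size):
            index_ranges = (range(len(relations[n])) for n in chosen)
            for idxs in product(*index_ranges):
                members = frozenset(zip(chosen, idxs))
                if is_jcc([relations[n][i] for n, i in members]):
                    candidates.append(members)
    # Keep only the sets that are not strictly contained in another candidate.
    maximal = [s for s in candidates if not any(s < t for t in candidates)]
    attrs = sorted({k for rows in relations.values() for row in rows for k in row})
    result = []
    for s in maximal:
        merged = dict.fromkeys(attrs)  # missing attributes stay None (null)
        for n, i in s:
            merged.update(relations[n][i])
        result.append(merged)
    return result


# Hypothetical sample data: three relations joined on country and city.
RELATIONS = {
    "Climates": [
        {"country": "Canada", "climate": "diverse"},
        {"country": "UK", "climate": "temperate"},
    ],
    "Hotels": [
        {"country": "Canada", "city": "Toronto", "hotel": "Plaza"},
    ],
    "Sites": [
        {"country": "Canada", "city": "London", "site": "Air Show"},
        {"country": "UK", "city": "London", "site": "Buckingham Palace"},
    ],
}

RESULT = full_disjunction(RELATIONS)
```

In this sample, the Canadian hotel and the Canadian site disagree on `city`, so no result tuple combines all three relations. Instead, the Canada climate tuple appears twice, once combined with the hotel and once with the site, which is the "maximally combines ... while preserving all information" behavior described above.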
