Using Semi-Joins to Solve Relational Queries

The semi-join is a relational algebraic operation that selects a set of tuples in one relation that match one or more tuples of another relation on the joining domains. Semi-joins have been used as a basic ingredient in query processing strategies for a number of hardware and software database systems. However, not all queries can be solved entirely using semi-joins. In this paper the exact class of relational queries that can be solved using semi-joins is shown. It is also shown that queries outside of this class may not even be partially solvable using "short" semi-join programs. In addition, a linear-time membership test for this class is presented.