Distributed optimization of cyclic queries with parallel semijoins

We consider the problem of finding semijoin sequences, optimal where possible, that reduce (ideally fully reduce) the relations referenced in a cyclic query graph. We propose combining parallel and sequential semijoin operations to minimize the amount of data transmitted during distributed query processing. Our experiments show that this approach is efficient and that it effectively reduces the total amount of data transmitted.
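
To make the terminology concrete, the following is a minimal sketch of semijoin reduction on a small cyclic (triangle) query graph. It is an illustration under stated assumptions, not the method proposed here: the relation names `R`, `S`, `T`, the helper names `project`, `semijoin`, and `reduce_to_fixpoint`, and the cost measure (number of distinct join-key values shipped) are all hypothetical. The sketch applies semijoins sequentially until no relation shrinks; it does not schedule semijoins in parallel, does not search for an optimal sequence, and makes no claim of full reduction for arbitrary cyclic queries.

```python
def project(rows, attrs):
    """Distinct join-key values of `rows` on `attrs` (what a site would ship)."""
    return {tuple(t[a] for a in attrs) for t in rows}


def semijoin(rows, keys, attrs):
    """Keep only the tuples of `rows` whose values on `attrs` appear in `keys`."""
    return [t for t in rows if tuple(t[a] for a in attrs) in keys]


def reduce_to_fixpoint(relations, edges):
    """Apply semijoins along every edge, in both directions, until nothing shrinks.

    Returns the reduced relations and a hypothetical transmission cost:
    the total number of join-key values shipped between relations.
    """
    rels = {name: list(rows) for name, rows in relations.items()}
    shipped = 0
    changed = True
    while changed:
        changed = False
        for left, right, attrs in edges:
            for target, source in ((left, right), (right, left)):
                keys = project(rels[source], attrs)
                shipped += len(keys)  # assumed cost of sending the projection
                reduced = semijoin(rels[target], keys, attrs)
                if len(reduced) < len(rels[target]):
                    rels[target] = reduced
                    changed = True
    return rels, shipped


# Triangle query graph: R(a, b) - S(b, c) - T(c, a) - R
relations = {
    "R": [{"a": 1, "b": 1}, {"a": 2, "b": 2}],
    "S": [{"b": 1, "c": 1}, {"b": 2, "c": 3}],
    "T": [{"c": 1, "a": 1}, {"c": 4, "a": 2}],
}
edges = [("R", "S", ["b"]), ("S", "T", ["c"]), ("T", "R", ["a"])]

reduced, shipped = reduce_to_fixpoint(relations, edges)
for name, rows in reduced.items():
    print(name, rows)
print("join-key values shipped:", shipped)
```

In this toy instance each relation shrinks to the single tuple that takes part in the join result. The order in which the semijoins are applied affects how many key values are shipped, which is the quantity a semijoin sequence would aim to minimize.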