Optimizing Database Accesses for Parallel Processing of Multikey Range Searches

A multikey search query typically qualifies many records because it specifies only some of the attributes and leaves the rest unspecified. The distribution of data among the parallel devices is therefore important for access concurrency in multikey search queries. Although some heuristics exist for distributing data for multikey accesses, those methods do not consider range specifications in a query. In this paper we investigate optimal data distribution for multiattribute range queries in parallel processing file systems. We show that, for various types of multiattribute range queries, there are inherent limitations on achieving optimal distribution. The results show that in many cases an optimal distribution does not exist, even for files with only two fields. We give sufficient conditions for the nonexistence of a perfect optimal distribution for certain types of multiattribute range queries. We also develop data distribution methods for several useful types of multiattribute range queries and give sufficient conditions under which these methods are optimal. We show that the proposed distribution methods are perfect optimal for certain types of multiattribute range queries and strict optimal for a large class of multiattribute range queries.
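As an illustration of why range queries constrain data distribution, the following minimal sketch checks one allocation rule against every range query on a small two-field file. Everything in it is an assumption made for illustration and is not taken from the paper: the file is modelled as an n1 x n2 grid of buckets, the rule `device_of` (bucket (i, j) goes to device (i + j) mod m) is a hypothetical stand-in rather than one of the proposed methods, and the yardstick is the elementary bound that a query touching k buckets needs at least ceil(k / m) accesses on the busiest of m devices.

```python
from math import ceil


def device_of(i, j, m):
    """Hypothetical allocation rule (an assumption, not the paper's method):
    bucket (i, j) is stored on device (i + j) mod m."""
    return (i + j) % m


def response_time(rows, cols, m):
    """Number of buckets the busiest device must read for one range query."""
    counts = [0] * m
    for i in rows:
        for j in cols:
            counts[device_of(i, j, m)] += 1
    return max(counts)


def lower_bound(rows, cols, m):
    """No allocation can beat ceil(k / m) for a query touching k buckets."""
    return ceil(len(rows) * len(cols) / m)


def suboptimal_range_queries(n1, n2, m):
    """List every range query on an n1 x n2 grid that exceeds the bound.

    A query is reported as ((lo1, hi1), (lo2, hi2)), with inclusive ranges
    on the first and second field respectively.
    """
    bad = []
    for lo1 in range(n1):
        for hi1 in range(lo1, n1):
            for lo2 in range(n2):
                for hi2 in range(lo2, n2):
                    rows = range(lo1, hi1 + 1)
                    cols = range(lo2, hi2 + 1)
                    if response_time(rows, cols, m) > lower_bound(rows, cols, m):
                        bad.append(((lo1, hi1), (lo2, hi2)))
    return bad
```

On a 4 x 4 grid, this rule meets the bound for every range query when m = 2. With m = 4, the query covering rows 0 to 1 and columns 0 to 1 places two of its four buckets on device 1, although the bound is one access. The check says nothing about whether some other allocation would do better; it only shows how a single rule can be tested exhaustively on a small file.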