CMD: A Multidimensional Declustering Method for Parallel Database Systems

I/O parallelism appears to be a promising approach to achieving high performance in parallel database systems. In such systems, it is essential to decluster database files into fragments and spread them across multiple disks so that the DBMS software can exploit the I/O bandwidth by reading and writing the disks in parallel. In this paper, we consider the problem of declustering multidimensional data on a parallel disk system. Since the multidimensional range query is the main workhorse for applications accessing such data, our aim is to provide efficient support for it. A new declustering method for parallel disk systems, called coordinate modulo distribution (CMD), is proposed. Our analysis shows that the method achieves optimum parallelism for a very high percentage of range queries on multidimensional data, if the distribution of data on each dimension is stationary. We have derived the exact conditions under which optimality is achieved. Also provided are the worst-case and average-case bounds on multidimensional range query performance. Experimental results show that the method achieves near-optimum performance in almost all cases, even when the stationarity assumption does not hold. Details of the parallel algorithms for range query processing and data maintenance are also provided.