Disk Allocation Methods Using Error Correcting Codes

The problem of declustering, that is, how to distribute a binary Cartesian product file on multiple disks to maximize the parallelism for partial match queries, is examined. Cartesian product files appear as a result of some secondary key access methods. For the binary case, the problem is reduced to grouping the 2^n binary strings on n bits into m groups of dissimilar strings. It is proposed that the strings be grouped such that each group forms an error correcting code (ECC). This construction guarantees that the strings of a given group will have large Hamming distances, i.e., they will differ in many bit positions. Intuitively, this should result in good declustering. The authors describe how to build a declustering scheme using an ECC, and prove a theorem that gives a necessary condition for the proposed method to be optimal. Analytical results show that the proposed method is superior to older heuristics, and that it is very close to the theoretical (nontight) bound.
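
The grouping idea can be illustrated with a minimal sketch. It is not taken from the paper: it assumes n = 7, m = 8 disks, and the Hamming(7,4) code, and it uses the syndrome of each bit string as its disk number so that every disk holds one coset of the code. The function names (syndrome, decluster, min_distance, query_load) and the "*" notation for unspecified query bits are hypothetical choices made for this illustration.

```python
from itertools import combinations, product


def syndrome(bits):
    """Assumed disk id: the Hamming(7,4) syndrome of a 7-bit string.

    The parity-check matrix column for 1-based position p is the binary
    representation of p, so the syndrome is the XOR of the positions
    that hold a 1-bit.
    """
    s = 0
    for pos, b in enumerate(bits, start=1):
        if b:
            s ^= pos
    return s


def decluster(n=7):
    """Partition all 2^n bit strings into groups keyed by disk id."""
    groups = {}
    for bits in product((0, 1), repeat=n):
        groups.setdefault(syndrome(bits), []).append(bits)
    return groups


def min_distance(group):
    """Smallest pairwise Hamming distance inside one group."""
    return min(
        sum(a != b for a, b in zip(u, v))
        for u, v in combinations(group, 2)
    )


def query_load(pattern):
    """Largest number of qualifying buckets on any single disk.

    `pattern` is a partial match query such as "01**1*0", where '*'
    marks an unspecified bit (hypothetical notation).
    """
    choices = [(0, 1) if c == "*" else (int(c),) for c in pattern]
    load = {}
    for bits in product(*choices):
        disk = syndrome(bits)
        load[disk] = load.get(disk, 0) + 1
    return max(load.values())


groups = decluster()
```

Under these assumptions each disk receives 16 of the 128 strings, and any two strings on the same disk differ in at least 3 bit positions, because their bitwise difference is a nonzero codeword. Calling query_load on a pattern gives the number of buckets the busiest disk must serve, which is the quantity a declustering scheme tries to keep small.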