Declustering Databases on Heterogeneous Disk Systems

Declustering is a well-known strategy for achieving maximum I/O parallelism in multi-disk systems. Many declustering methods have been proposed for symmetrical disk systems, i.e., multi-disk systems in which all disks have the same speed and capacity. This work deals with the problem of adapting such declustering methods to heterogeneous environments, in which there are many types of disks and servers with a wide range of speeds and capacities. We deal first with the case of perfectly declustered queries, i.e., queries which retrieve a fixed proportion of the answer from each disk. We show that the fraction of the dataset which must be allocated to each disk is affected by both the relative speed and the capacity of the disk. Furthermore, the hierarchical structure of most distributed systems, where groups of disks are placed in servers, introduces further complications, because variations in server and network bandwidths may affect the actual achievable transfer rates. We propose an algorithm which determines the fraction of the dataset which must be loaded on each disk. The algorithm may be tailored to find the disk loading that gives minimal response time for a given database size, or to compute a system profile showing the optimal loading of the disks for all possible ranges of database sizes. Next we look at the probabilistic aspects of this problem and show how to optimize the expected retrieval time when the proportions of the data retrieved from each disk are random variables. We show the rather surprising result that, to achieve optimality in this case, the fraction of the data loaded on each disk must not simply be proportional to its speed; rather, some compensation must be made, with a bias towards the faster disks. The methods proposed here are general and can be used in conjunction with most known symmetric declustering methods.
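To illustrate why both speed and capacity matter in the perfectly declustered case, the following is a minimal sketch, not the algorithm of this paper. It assumes a flat set of independent disks, ignores server and network bandwidth limits, and assumes a query reads the whole dataset, so that response time is the largest per-disk value of load divided by speed. The function name `allocate`, its parameters, and the units (MB and MB/s) are hypothetical. Under these assumptions, each disk is loaded in proportion to its speed until it is full, and the remainder is spread over the disks that still have room.

```python
def allocate(speeds, capacities, size):
    """Return (loads, response_time) minimising max(load[i] / speeds[i]).

    Assumptions (not from the paper): disks are independent, there are no
    server or network bottlenecks, and a query reads all of `size`.
    """
    if size > sum(capacities):
        raise ValueError("database does not fit on the disks")
    n = len(speeds)
    loads = [0.0] * n
    # Visit disks in order of how soon they would fill up at full speed.
    order = sorted(range(n), key=lambda i: capacities[i] / speeds[i])
    remaining = float(size)          # data not yet placed
    speed_left = float(sum(speeds))  # total speed of disks not yet full
    for pos, i in enumerate(order):
        # Response time if the rest is split in proportion to speed.
        t = remaining / speed_left
        if capacities[i] / speeds[i] < t:
            # Disk i cannot hold its proportional share: fill it completely.
            loads[i] = float(capacities[i])
            remaining -= capacities[i]
            speed_left -= speeds[i]
        else:
            # Every remaining disk can hold its proportional share.
            for j in order[pos:]:
                loads[j] = t * speeds[j]
            return loads, t
    # Reached only if every disk was filled to capacity.
    return loads, max(loads[i] / speeds[i] for i in range(n))


# Two disks: the first is twice as fast but holds only 40 MB.
loads, t = allocate(speeds=[2.0, 1.0], capacities=[40.0, 100.0], size=90.0)
print(loads, t)
```

In this example a purely speed-proportional split would place 60 MB on the fast disk, which exceeds its capacity, so the sketch fills it and places the rest on the slower disk. Running the function over a range of `size` values gives a rough analogue of a system profile under the same simplifying assumptions.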
