Multi-Site Retrieval of Declustered Data

Declustering techniques reduce query response times through parallel I/O by distributing data among multiple devices. Recently, replication-based approaches were proposed to further reduce response time. All of these replication-based schemes assume that replication is done at a single site. In this paper, we consider replicated data stored at multiple sites. We formulate the multi-site retrieval problem as a maximum flow problem and solve it using maximum flow techniques. We also propose a low-complexity online algorithm for the problem. We evaluate the proposed scheme using various replication schemes, query types, and query loads. The proposed scheme can easily be extended to nonuniform data and to any number of sites. Experimental results show that replication using orthogonal allocation performs best under various settings.
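To make the maximum flow formulation concrete, the following is a minimal sketch of how retrieval of replicated buckets can be cast as a flow problem. It is not the paper's algorithm: the graph layout (source to buckets, buckets to the disks holding a copy, disks to sink), the function names `max_flow`, `can_retrieve`, and `min_rounds`, and the per-disk capacity model are all assumptions chosen for illustration. Site-dependent delays and the online algorithm mentioned above are not modeled; disks at different sites are distinguished only by their names and by the capacity each one is given.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dict-of-dicts capacity graph."""
    # Build the residual graph, adding zero-capacity reverse edges.
    residual = {u: dict(vs) for u, vs in capacity.items()}
    for u, vs in capacity.items():
        for v in vs:
            residual.setdefault(v, {}).setdefault(u, 0)
    residual.setdefault(source, {})

    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow

        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push the bottleneck amount along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


def can_retrieve(replicas, disk_capacity):
    """True if every bucket can be read from one of its replicas
    without any disk exceeding its (assumed) capacity."""
    capacity = {"source": {}}
    for bucket, disks in replicas.items():
        capacity["source"][("bucket", bucket)] = 1
        capacity[("bucket", bucket)] = {("disk", d): 1 for d in disks}
    for disk, cap in disk_capacity.items():
        capacity[("disk", disk)] = {"sink": cap}
    return max_flow(capacity, "source", "sink") == len(replicas)


def min_rounds(replicas):
    """Smallest uniform per-disk load k that serves the whole query,
    or None if some bucket has no replica at all."""
    if not replicas:
        return 0
    disks = {d for ds in replicas.values() for d in ds}
    for k in range(1, len(replicas) + 1):
        if can_retrieve(replicas, {d: k for d in disks}):
            return k
    return None


# Hypothetical query: three buckets, one disk at site A and one at site B.
query = {"b0": ["A0"], "b1": ["A0", "B0"], "b2": ["A0", "B0"]}
print(min_rounds(query))  # 2
```

In this sketch a query is feasible at load k exactly when the maximum flow saturates every source-to-bucket edge, so the smallest such k is the best achievable number of parallel accesses under the uniform-capacity assumption. Giving disks at a slower site a smaller entry in `disk_capacity` is one simple way to approximate unequal sites, though that is an illustrative choice rather than the model used in the paper.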
