Threshold-based declustering

Declustering techniques reduce query response time through parallel I/O by distributing data among multiple devices. Except in a few cases, it is not possible to find declustering schemes that are optimal for all spatial range queries. As a result, most research on declustering has focused on finding schemes with low worst-case additive error. However, schemes based on additive error have many limitations, including the lack of progressive guarantees and the existence of small non-optimal queries. In this paper, we take a different approach and propose threshold-based declustering. We investigate the threshold k such that all spatial range queries with =
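To make the notion of additive error concrete, the following is a minimal sketch and not the scheme proposed in the paper. It assumes a two-dimensional grid of buckets, N disks, and the simple assignment (i + j) mod N, chosen here only for illustration. The retrieval cost of a range query is taken to be the largest number of its buckets stored on any single disk, the optimal cost is ceil(b / N) for a query covering b buckets, and the additive error is the difference between the two. All function names are hypothetical.

```python
from itertools import product
from math import ceil


def disk_modulo(i, j, n_disks):
    """Illustrative assignment of grid bucket (i, j) to a disk (an assumption)."""
    return (i + j) % n_disks


def query_cost(r0, c0, rows, cols, n_disks, scheme=disk_modulo):
    """Parallel cost of a range query: the maximum number of buckets on one disk."""
    load = [0] * n_disks
    for i, j in product(range(r0, r0 + rows), range(c0, c0 + cols)):
        load[scheme(i, j, n_disks)] += 1
    return max(load)


def additive_error(r0, c0, rows, cols, n_disks, scheme=disk_modulo):
    """Actual cost minus the optimal cost ceil(b / N) for b = rows * cols buckets."""
    optimal = ceil(rows * cols / n_disks)
    return query_cost(r0, c0, rows, cols, n_disks, scheme) - optimal


def worst_case_additive_error(grid, n_disks, scheme=disk_modulo):
    """Exhaustively check every range query on a grid x grid array of buckets."""
    worst = 0
    for rows, cols in product(range(1, grid + 1), repeat=2):
        for r0 in range(grid - rows + 1):
            for c0 in range(grid - cols + 1):
                err = additive_error(r0, c0, rows, cols, n_disks, scheme)
                worst = max(worst, err)
    return worst


if __name__ == "__main__":
    # A 2 x 2 query on 4 disks: two of its four buckets share a disk.
    print(additive_error(0, 0, 2, 2, n_disks=4))
    print(worst_case_additive_error(grid=4, n_disks=4))
```

Under this illustrative assignment, a 2 x 2 query on 4 disks already needs two parallel accesses where one would be optimal, which is an instance of the small non-optimal queries that a worst-case additive error bound does not rule out.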
