Adaptive and Incremental Processing for Distance Join Queries

A spatial distance join is a relatively new type of operation introduced for spatial and multimedia database applications. In online query processing and Internet search environments, it is often combined with additional requirements for ranking and a stopping cardinality. These requirements pose new challenges, as well as opportunities, for more efficient processing of spatial distance join queries. In this paper, we first present an efficient k-distance join algorithm that uses spatial indexes such as R-trees. Bidirectional node expansion and plane-sweeping techniques are used for fast pruning of distant pairs, and the plane sweeping is further optimized by novel strategies for selecting a sweeping axis and direction. Furthermore, we propose adaptive multistage algorithms for k-distance join and incremental distance join operations. Our performance study shows that the proposed adaptive multistage algorithms outperform previous work by up to an order of magnitude for both k-distance join and incremental distance join queries, under various operational conditions.
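To make the k-distance join concrete, the following is a minimal in-memory sketch in Python, not the algorithm proposed in the paper. It assumes both inputs are small lists of 2-D points, uses no R-tree or node expansion, and always sweeps along the x axis in ascending order, whereas the paper selects the sweeping axis and direction. The function name `k_distance_join` and its parameters are hypothetical. It illustrates only the pruning idea: once k candidate pairs are held, any pair whose separation along the sweep axis already exceeds the current k-th smallest distance can be skipped.

```python
import heapq
import math


def k_distance_join(R, S, k):
    """Return the k closest (distance, r, s) pairs between point sets R and S.

    Illustrative sketch only: in-memory points, fixed x-axis sweep, no index.
    """
    if k <= 0:
        return []
    R_sorted = sorted(R)  # tuples sort by x first
    S_sorted = sorted(S)
    heap = []             # max-heap via negated distance: (-dist, r, s)
    start = 0             # first S point that can still matter for later r

    for r in R_sorted:
        cutoff = -heap[0][0] if len(heap) == k else math.inf
        # S points too far to the left of r are also too far from every
        # later r, because r.x only grows and the cutoff only shrinks.
        while start < len(S_sorted) and r[0] - S_sorted[start][0] > cutoff:
            start += 1
        for j in range(start, len(S_sorted)):
            s = S_sorted[j]
            cutoff = -heap[0][0] if len(heap) == k else math.inf
            # The x gap is a lower bound on the true distance, so every
            # remaining S point (even further right) can be pruned.
            if s[0] - r[0] > cutoff:
                break
            d = math.dist(r, s)
            if len(heap) < k:
                heapq.heappush(heap, (-d, r, s))
            elif d < cutoff:
                heapq.heapreplace(heap, (-d, r, s))

    # Report pairs in ascending order of distance (the ranking requirement).
    return sorted((-neg_d, r, s) for neg_d, r, s in heap)
```

The max-heap holds the current k best pairs, so its top element is the pruning threshold; this threshold tightens as closer pairs are found, which is what lets a stopping cardinality k reduce the work compared with computing all pairs.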
