Closest pair queries in spatial databases

This paper addresses the problem of finding the K closest pairs between two spatial data sets, where each set is stored in a structure belonging to the R-tree family. Five algorithms (four recursive and one iterative) are presented for solving this problem, with the single closest pair (K = 1) treated as a special case. An extensive experimental study, using both synthetic and real point data sets, is presented. It explores a wide range of values for the basic parameters affecting the performance of the algorithms, in particular the degree of overlap between the two data sets. Moreover, the new algorithms are compared, both algorithmically and experimentally, with existing incremental algorithms that address the same problem. In most settings, the proposed algorithms clearly outperform the existing ones.
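
To make the query itself concrete, the following is a minimal brute-force sketch of what a K closest pairs query returns. It is not one of the five algorithms of the paper, which operate on R-trees and are not reproduced here; it only serves as a reference definition. The function name `k_closest_pairs`, the tuple representation of points, and the use of Euclidean distance are assumptions made for illustration.

```python
import heapq
import math
from itertools import product


def k_closest_pairs(p_points, q_points, k):
    """Return the k pairs (distance, p, q) with the smallest distance.

    Brute-force reference only: every pair in the cross product of the
    two sets is examined, so the cost is O(|P| * |Q|) distance
    computations. Euclidean distance is an assumption of this sketch.
    """
    if k <= 0:
        return []
    # Lazily generate (distance, p, q) for every pair across the two sets.
    pairs = (
        (math.dist(p, q), p, q)
        for p, q in product(p_points, q_points)
    )
    # nsmallest keeps only k candidates in memory and returns them
    # sorted by ascending distance.
    return heapq.nsmallest(k, pairs)


# Hypothetical example data: two small point sets in the plane.
P = [(0.0, 0.0), (5.0, 5.0)]
Q = [(1.0, 0.0), (5.0, 7.0), (10.0, 10.0)]

two_closest = k_closest_pairs(P, Q, 2)
single_closest = k_closest_pairs(P, Q, 1)  # the K = 1 special case
```

With this data, `two_closest` holds the pairs at distances 1.0 and 2.0, and `single_closest` holds only the first of them. An index-based algorithm of the kind studied in the paper aims to produce the same result while examining far fewer pairs.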
