A Unified Approach for Indexed and Non-Indexed Spatial Joins

Most spatial join algorithms either assume the existence of a spatial index structure that is traversed during the join process, or solve the problem by sorting, partitioning, or on-the-fly index construction. In this paper, we develop a simple plane-sweeping algorithm that unifies the index-based and non-index-based approaches. This algorithm processes indexed as well as non-indexed inputs, extends naturally to multiway joins, and can be built easily from a few standard operations. We present the results of a comparative study of the new algorithm with several index-based and non-index-based spatial join algorithms. We consider a number of factors, including the relative performance of CPU and disk, the quality of the spatial indexes, and the sizes of the input relations. An important conclusion from our work is that using an index-based approach whenever indexes are available does not always lead to the best execution time, and hence we propose the use of a simple cost model to decide when to follow an index-based approach.
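
To make the plane-sweeping idea concrete, the following is a minimal in-memory sketch of a plane-sweep join over two sets of axis-aligned rectangles. It is not the unified algorithm developed in the paper: it ignores indexes, external memory, and multiway joins. The function name `plane_sweep_join`, the `(xmin, ymin, xmax, ymax)` tuple layout, and the treatment of rectangles as closed (touching boundaries count as intersecting) are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed layout: (xmin, ymin, xmax, ymax), closed on all sides.
Rect = Tuple[float, float, float, float]


def plane_sweep_join(r: List[Rect], s: List[Rect]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) such that r[i] intersects s[j]."""
    # One event per rectangle, keyed by its left edge; the sweep line
    # moves from left to right across both inputs at once.
    events = [(rect[0], 0, i) for i, rect in enumerate(r)]
    events += [(rect[0], 1, j) for j, rect in enumerate(s)]
    events.sort()

    inputs = (r, s)
    active = ([], [])  # active[0]: indices into r, active[1]: indices into s
    result = []
    for x, side, idx in events:
        rect = inputs[side][idx]
        other = 1 - side
        # Discard rectangles of the other input that end left of the sweep line.
        active[other][:] = [k for k in active[other] if inputs[other][k][2] >= x]
        # Every remaining rectangle overlaps in x, so only y needs testing.
        for k in active[other]:
            o = inputs[other][k]
            if o[1] <= rect[3] and rect[1] <= o[3]:
                result.append((idx, k) if side == 0 else (k, idx))
        active[side].append(idx)
    return result


if __name__ == "__main__":
    r = [(0, 0, 2, 2), (5, 5, 6, 6)]
    s = [(1, 1, 3, 3), (2, 2, 4, 4), (10, 10, 11, 11)]
    print(sorted(plane_sweep_join(r, s)))
```

Each intersecting pair is reported exactly once, at the moment the later of its two left edges is reached. The active sets here are plain lists scanned linearly; a practical implementation would likely replace them with a structure that supports faster range queries on the y-interval.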
