Caching Strategies for Spatial Joins

The filter-and-refine strategy is well established as the basis for spatial join algorithms. In contrast to the filter step, the refinement step has received little attention, even though it contributes significantly to the total cost of evaluating a join. This paper reports investigations of spatial join algorithms for z-ordering and R-trees, with particular emphasis on the interactions between the algorithms chosen for the filter, sequencing, and refinement steps, and on the effects of clustered and unclustered organization of the full spatial descriptions of objects. Our experiments show that, while it is generally desirable to introduce an additional housekeeping step to reduce the I/O cost of the refinement step, this is not necessary in all cases. In addition, we propose a new caching strategy for spatial joins, called zig-zag, which outperforms its competitors in all but one case. These results suggest that spatial joins need caching strategies different from those used for non-spatial joins. Furthermore, our experiments confirm that the choice of sequencing strategy is very important and that clustering has a significant influence on join performance.
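
To make the three steps named above concrete, the following is a minimal sketch, not the paper's implementation. It assumes axis-aligned bounding boxes given as (xmin, ymin, xmax, ymax), a nested-loop filter in place of z-ordering or R-trees, a plain sort as the sequencing step, and an ordinary LRU cache in front of the refinement step. The zig-zag strategy itself is not described in this abstract and is not reproduced here; all names (`filter_step`, `LRUCache`, `refine`) are hypothetical.

```python
from collections import OrderedDict
from itertools import product


def mbr_intersects(a, b):
    """True if two boxes (xmin, ymin, xmax, ymax) overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def filter_step(r_boxes, s_boxes):
    """Filter step: candidate (r_id, s_id) pairs whose bounding boxes intersect.

    A nested loop stands in for an index-based filter (assumption).
    """
    return [
        (i, j)
        for (i, a), (j, b) in product(r_boxes.items(), s_boxes.items())
        if mbr_intersects(a, b)
    ]


class LRUCache:
    """LRU cache for full object descriptions; counts simulated fetches."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity
        self.fetch = fetch          # hypothetical loader, e.g. a disk read
        self.data = OrderedDict()
        self.misses = 0

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)      # mark as most recently used
            return self.data[key]
        self.misses += 1                    # one simulated I/O
        value = self.fetch(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # evict least recently used
        return value


def refine(candidates, cache, exact_test):
    """Refinement step: run the exact geometric test on each candidate pair."""
    return [
        (i, j)
        for i, j in candidates
        if exact_test(cache.get(("R", i)), cache.get(("S", j)))
    ]


if __name__ == "__main__":
    candidates = [(0, 0), (1, 1), (0, 1), (1, 0)]
    for label, order in (("unsequenced", candidates), ("sorted", sorted(candidates))):
        cache = LRUCache(capacity=2, fetch=lambda key: key)
        refine(order, cache, lambda a, b: True)
        print(label, "fetches:", cache.misses)
```

In this toy setting the order in which candidate pairs reach the refinement step changes how many object fetches the cache has to perform, which is the kind of interaction between sequencing and caching that the abstract refers to.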
