On Multi-way Spatial Joins with Direction Predicates

Spatial joins are fundamental in spatial databases. Over the last decade, research has focused primarily on joins with the predicate "region intersection." In modern database applications involving geospatial data, such as GIS, efficient evaluation of joins with other spatial predicates is yet to be fully explored. In addition, most existing join algorithms were developed for two-way joins, and a multi-way join is traditionally treated as a sequence of two-way joins. This paper studies the evaluation of multi-way spatial joins with direction predicates, covering both complexity bounds and efficient algorithms. We first give I/O-efficient algorithms based on plane sweeping for 2-way direction joins and show that, by combining the plane sweeping technique with external priority search trees, a 2-way direction join of N-tuple relations can be evaluated in $O(N \log_b N/M + k)$ I/Os in the worst case, where M is the size of the memory, b is the page size and k is the result size. The algorithms are then extended to perform a subclass of multi-way direction joins called "star joins". We show that the I/O complexity of evaluating an m-way star join of N-tuple relations is $O(mN \log_b N/M + K + k)$, where $K \le mN^2$ is the size of the intermediate result, and M, b and k ($\le N^m$) are the same as above. We also apply the algorithm for star joins to a more general case of multi-way joins, namely star connections of star joins, and show that these can be evaluated in polynomial time. In the general case, we show that testing emptiness of a multi-way direction join is NP-complete. This lower bound holds even when, in the join predicate, (1) only one attribute of each relation is involved, and (2) each spatial attribute occurs a bounded number of times. It implies that join evaluation in these cases is NP-hard.
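
As a rough illustration of the plane sweeping idea for a 2-way direction join, the following in-memory sketch reports every pair (a, b) in which point a lies strictly northeast of point b. It rests on several assumptions that do not come from the paper: the objects are points rather than regions, "northeast" means strictly larger in both coordinates, and a sorted Python list stands in for the external priority search tree. The function name `northeast_join` is hypothetical, and the sketch makes no claim about I/O cost.

```python
from bisect import bisect_left, insort


def northeast_join(r, s):
    """Return all pairs (a, b), a in r and b in s, with a strictly northeast of b.

    Points are (x, y) tuples. This is an in-memory sketch, not the
    external-memory algorithm analysed in the paper.
    """
    # Event kind 0 = query (point of r), kind 1 = insert (point of s).
    # Sorting by (x, kind) handles queries before inserts at equal x,
    # so an s-point with the same x as a is never reported for a.
    events = [(p[0], 0, p) for p in r] + [(q[0], 1, q) for q in s]
    events.sort()

    seen = []  # (y, x) of s-points strictly left of the sweep line, sorted by y
    out = []
    for _, kind, p in events:
        if kind == 1:
            insort(seen, (p[1], p[0]))
        else:
            # Every entry before `cut` has y strictly below p's y.
            cut = bisect_left(seen, (p[1], float("-inf")))
            out.extend((p, (bx, by)) for by, bx in seen[:cut])
    return out
```

The sweep moves left to right over the x-coordinates, so each query only has to filter on y. Locating the cut by binary search and then reporting the matching prefix is what makes the reporting step output-sensitive, although insertion into a Python list is still linear in the worst case.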
