Multiway spatial joins

Due to the evolution of Geographical Information Systems, large collections of spatial data with various thematic contents are now available. As a result, users' interest is no longer limited to simple spatial selections and joins; complex query types that involve numerous spatial inputs are becoming more common. Although several algorithms have been proposed for computing pairwise spatial joins, limited work exists on the processing and optimization of multiway spatial joins. In this article, we review pairwise spatial join algorithms and show how they can be combined for multiple inputs. In addition, we explore the application of synchronous traversal (ST), a methodology that processes all inputs synchronously without producing intermediate results. We then integrate the two approaches in an engine that includes ST and pairwise algorithms, using dynamic programming to determine the optimal execution plan. The results show that, in most cases, multiway spatial joins are best processed by combining ST with pairwise methods. Finally, we study the optimization of very large queries by employing randomized search algorithms.
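
To make the notion of a multiway spatial join concrete, the following is a minimal in-memory sketch, not the method evaluated in the article. It assumes that objects are axis-aligned rectangles, that the join predicate is rectangle intersection, and that the query is given as a list of edges between input indices. The names `Rect`, `intersects`, and `multiway_join` are hypothetical. The sketch binds one object per input by plain backtracking, so no pairwise intermediate result is materialized; it uses no R-trees and is therefore only a loose analogue of synchronous traversal.

```python
from typing import List, Tuple

# Assumption: a rectangle is (xmin, ymin, xmax, ymax).
Rect = Tuple[float, float, float, float]


def intersects(a: Rect, b: Rect) -> bool:
    """True if the two rectangles overlap (touching boundaries count)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def multiway_join(
    inputs: List[List[Rect]], edges: List[Tuple[int, int]]
) -> List[Tuple[int, ...]]:
    """Return every tuple of indices (one per input) that satisfies all edges.

    An edge (a, b) requires the chosen objects of inputs a and b to intersect.
    """
    n = len(inputs)
    results: List[Tuple[int, ...]] = []
    partial: List[int] = []  # partial[j] = index chosen for input j

    def consistent(i: int, k: int) -> bool:
        # Check object k of input i against all already-bound inputs.
        for a, b in edges:
            lo, hi = min(a, b), max(a, b)
            if hi == i and not intersects(inputs[hi][k], inputs[lo][partial[lo]]):
                return False
        return True

    def extend(i: int) -> None:
        if i == n:
            results.append(tuple(partial))
            return
        for k in range(len(inputs[i])):
            if consistent(i, k):
                partial.append(k)
                extend(i + 1)
                partial.pop()

    extend(0)
    return results


# Three small inputs joined first as a chain, then as a clique.
A = [(0, 0, 2, 2), (10, 10, 11, 11)]
B = [(1, 1, 3, 3), (20, 20, 21, 21)]
C = [(2.5, 2.5, 4, 4), (0, 0, 0.5, 0.5)]
chain = multiway_join([A, B, C], [(0, 1), (1, 2)])
clique = multiway_join([A, B, C], [(0, 1), (1, 2), (0, 2)])
```

In this example the chain query binds the first object of each input, while the clique query returns nothing because the first objects of `A` and `C` do not overlap. A realistic implementation would prune the search with an index rather than scan every object of every input.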
