Spatial joins: what's next?

The spatial join is a popular operation in spatial database systems, and its evaluation is a well-studied problem. This paper reviews prior research and recent trends in spatial join evaluation. The complexity of different data types, the range of possible join predicates, the use of modern commodity hardware, and support for parallel processing open up a number of interesting directions for future research, some of which we outline in the paper.
