Query Processing Using Distance Oracles for Spatial Networks

The popularity of location-based services and the need to process them in real time have led to an interest in performing queries on transportation networks, such as finding shortest paths and nearest neighbors. The challenge is that efficient execution of spatial operations usually requires computing distance along a spatial network rather than "as the crow flies," which is not simple. Techniques are described that enable determining the network distance between any pair of points (i.e., vertices) with as little as O(n) space rather than storing the n² distances between all pairs. This is achieved by being willing to expend a bit more time, such as O(log n) instead of O(1), and by accepting an error ε in the accuracy of the distance that is provided. The strategy reduces the space requirements by identifying groups of source and destination vertices for which the distance is approximately the same within some ε. The reductions are achieved by introducing a construct termed a distance oracle that yields an estimate of the network distance (termed the ε-approximate distance) between any two vertices in the spatial network. The distance oracle is obtained by showing how to adapt the well-separated pair technique from computational geometry to spatial networks. Initially, an ε-approximate distance oracle of size O(n/ε²) is used that can retrieve the approximate network distance in O(log n) time using a B-tree. The retrieval time can theoretically be reduced further to O(1) by proposing another ε-approximate distance oracle of size O((n log n)/ε²) that uses a hash table. Experimental results indicate that the proposed technique is scalable and can be applied to sufficiently large road networks.
For example, a 10-percent approximate oracle (ε = 0.1) on a large network yielded an average error of 0.9 percent, with 90 percent of the answers having an error of 2 percent or less, and an average retrieval time of 68 microseconds. The fact that the network distance can be approximated by one value is used to show how a number of spatial queries can be formulated using appropriate SQL constructs and a few built-in primitives. The result is that these operations can be executed on almost any modern database with no modifications, while taking advantage of existing query optimizers and query processing strategies.
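
The core idea, that one stored value can stand in for the network distance between every source in one group and every destination in another, can be illustrated with a minimal sketch. It is not the construction used in the paper: the toy network, the hand-picked clusters, the choice of representative vertices, and all names (`dijkstra`, `oracle`, `approx_distance`, `max_relative_error`) are assumptions made for illustration, and a Python dictionary stands in for the hash-table variant of the oracle.

```python
import heapq

def dijkstra(graph, source):
    """Exact network distances from `source` (graph is an adjacency dict)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical toy network: two tight clusters joined by one long road.
edges = [
    ("a1", "a2", 1.0), ("a2", "a3", 1.0),   # cluster A
    ("b1", "b2", 1.0), ("b2", "b3", 1.0),   # cluster B
    ("a2", "b2", 100.0),                    # long connecting road
]
graph = {}
for u, v, w in edges:
    graph.setdefault(u, []).append((v, w))
    graph.setdefault(v, []).append((u, w))

cluster_a = ["a1", "a2", "a3"]
cluster_b = ["b1", "b2", "b3"]
cluster_of = {**{v: "A" for v in cluster_a}, **{v: "B" for v in cluster_b}}

# One stored value for the whole (A, B) pair: the exact distance between
# two representatives (a2 and b2, chosen by hand for this sketch).
oracle = {("A", "B"): dijkstra(graph, "a2")["b2"]}

def approx_distance(u, v):
    """Look up the single value shared by the clusters of u and v.

    Only cross-cluster pairs are covered by this sketch.
    """
    key = (cluster_of[u], cluster_of[v])
    if key not in oracle:
        key = key[::-1]  # distances are symmetric in this undirected toy graph
    return oracle[key]

def max_relative_error():
    """Worst relative error of the oracle over all (A, B) vertex pairs."""
    worst = 0.0
    for u in cluster_a:
        exact = dijkstra(graph, u)
        for v in cluster_b:
            worst = max(worst, abs(approx_distance(u, v) - exact[v]) / exact[v])
    return worst
```

Here nine source-destination pairs share a single entry. Because the clusters are small relative to the distance separating them, the worst relative error stays well under ε = 0.1, which is the intuition behind adapting well-separated pairs to network distance.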
