On Computing Top-t Most Influential Spatial Sites

Given a set O of weighted objects, a set S of sites, and a query site s, the bichromatic reverse nearest neighbor (RNN) query computes the influence set of s, that is, the set of objects in O that consider s their nearest site among all sites in S. The influence of a site s can be defined as the total weight of its RNNs. This paper addresses the new problem of finding the top-t most influential sites from S inside a given spatial region Q. A straightforward approach is to find the sites in Q and compute the RNNs of every such site. This approach is inefficient for two reasons. First, all sites in Q must be identified, and there may be many of them. Second, both the site R-tree and the object R-tree must be browsed many times: for each site in Q, the R-tree of sites is browsed to identify the influence region, a polygonal region that may contain RNNs, and then the R-tree of objects is browsed to find the RNN set. This paper proposes an algorithm called TopInfluential-Sites, which finds the top-t most influential sites by systematically browsing both trees once. It provides novel pruning techniques based on a new metric called minExistDNN, so there is no need to compute the influence of all sites in Q, or even to visit all sites in Q. Experimental results verify that the proposed method outperforms the straightforward approach.
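
To make the problem definition concrete, the following is a minimal brute-force sketch of the query semantics only. It is not the TopInfluential-Sites algorithm and uses neither R-trees nor minExistDNN pruning. The function name `top_t_influential`, the tuple-based point representation, the axis-aligned rectangle for Q, and the Euclidean distance are all assumptions made for illustration.

```python
import heapq
from math import dist


def top_t_influential(objects, sites, region, t):
    """Brute-force top-t influential sites (illustrative baseline only).

    objects: list of ((x, y), weight) pairs         -- the set O
    sites:   list of distinct (x, y) tuples         -- the set S
    region:  (xmin, ymin, xmax, ymax) rectangle     -- the region Q (assumed shape)
    Returns up to t (site, influence) pairs, highest influence first.
    """
    # Each object adds its weight to its nearest site among ALL sites in S,
    # not only among the sites that fall inside Q.
    influence = {s: 0.0 for s in sites}
    for point, weight in objects:
        nearest = min(sites, key=lambda s: dist(point, s))
        influence[nearest] += weight

    # Only sites located inside Q are candidates for the answer.
    xmin, ymin, xmax, ymax = region
    candidates = [
        (s, influence[s])
        for s in sites
        if xmin <= s[0] <= xmax and ymin <= s[1] <= ymax
    ]
    return heapq.nlargest(t, candidates, key=lambda pair: pair[1])


sites = [(0, 0), (10, 0), (20, 0)]
objects = [((1, 0), 2.0), ((2, 1), 1.0), ((9, 0), 5.0),
           ((19, 0), 4.0), ((12, 0), 1.0)]
region = (-1, -1, 11, 1)  # contains (0, 0) and (10, 0) but not (20, 0)
result = top_t_influential(objects, sites, region, 2)
```

This sketch scans every object against every site, so its cost grows with |O| times |S|; that exhaustive work is what the pruning described in the abstract is meant to avoid.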
