Reverse-k-Nearest-Neighbor Join Processing

A reverse k-nearest neighbor (RkNN) query retrieves the objects in a database that have the query object among their k nearest neighbors. Processing such queries has received considerable research attention. However, the effect of running multiple RkNN queries at once (join) or within a short time interval (bulk/group query) has received little attention so far. In this paper, we analyze different types of RkNN joins and discuss possible solutions for the non-trivial variants of this problem, including self-pruning and mutual-pruning strategies. The results indicate that even with a moderate number of query objects (|R| ≈ 0.0007|S|), the CPU performance of state-of-the-art RkNN query algorithms based on mutual pruning deteriorates, and hence algorithms based on self pruning without precomputation perform better. In an extensive performance analysis, we report the I/O and CPU performance of the compared algorithms for a wide range of setups and suggest appropriate query algorithms for specific scenarios.
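
To make the RkNN join concrete, the following is a minimal brute-force sketch, not one of the pruning-based algorithms the paper evaluates. It assumes Euclidean distance, points given as coordinate tuples, and the convention that an object s in S reports a query r in R when fewer than k other objects of S are strictly closer to s than r is. The function name `rknn_join` and this tie-handling convention are illustrative assumptions.

```python
from math import dist, inf


def rknn_join(R, S, k):
    """Brute-force RkNN join (illustrative baseline, quadratic in |S|).

    Returns a dict mapping each query index i in R to the sorted list of
    indices j in S such that R[i] is among the k nearest neighbors of S[j].
    """
    result = {i: [] for i in range(len(R))}
    for j, s in enumerate(S):
        # Distances from s to every other database object, ascending.
        others = sorted(dist(s, t) for idx, t in enumerate(S) if idx != j)
        # k-th nearest-neighbor distance of s within S; if s has fewer
        # than k other objects, every query qualifies.
        knn_dist = others[k - 1] if len(others) >= k else inf
        for i, r in enumerate(R):
            # r is among the k nearest neighbors of s exactly when it is
            # no farther away than the current k-th neighbor of s.
            if dist(s, r) <= knn_dist:
                result[i].append(j)
    return result


if __name__ == "__main__":
    S = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
    R = [(0.4, 0.0)]
    print(rknn_join(R, S, k=1))
```

Because the k-th nearest-neighbor distance of each s is computed once and reused for every query in R, this baseline already shows why processing many queries together can share work that independent single RkNN queries would repeat.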
