Interval reverse nearest neighbor queries on uncertain data with Markov correlations

Nowadays, many applications return to the user a set of results that take the query as their nearest neighbor, which are commonly expressed through reverse nearest neighbor (RNN) queries. When considering moving objects, users would like to find objects that appear in the RNN result set for a period of time in some real-world applications such as collaboration recommendation and anti-tracking. In this work, we formally define the problem of interval reverse nearest neighbor (IRNN) queries over moving objects, which return the objects that maintain nearest neighboring relations to the moving query objects for the longest time in the given interval. Location uncertainty of moving data objects and moving query objects is inherent in various domains, and we investigate objects that exhibit Markov correlations, that is, each object's location is only correlated with its own location at previous timestamp while being independent of other objects. There exists the efficiency challenge for answering IRNN queries on uncertain moving objects with Markov correlations since we have to retrieve not only all the possible locations of each object at current time but also its historically possible locations. To speed up the query processing, we present a general framework for answering IRNN queries on uncertain moving objects with Markov correlations in two phases. In the first phase, we apply space pruning and probability pruning techniques, which reduce the search space significantly. In the second phase, we verify whether each unpruned object is an IRNN of the query object. During this phase, we propose an approach termed Probability Decomposition Verification (PDV) algorithm which avoid computing the probability of any object being an RNN of the query object exactly and thus improve the efficiency of verification. The performance of the proposed algorithm is demonstrated by extensive experiments on synthetic and real datasets, and the experimental results show that our algorithm is more efficient than the Monte-Carlo based approximate algorithm.

[1]  Yufei Tao,et al.  Time-parameterized queries in spatio-temporal databases , 2002, SIGMOD '02.

[2]  Elke Achtert,et al.  Efficient reverse k-nearest neighbor search in arbitrary metric spaces , 2006, SIGMOD Conference.

[3]  Xiang Lian,et al.  Efficient processing of probabilistic reverse nearest neighbor queries over uncertain data , 2009, The VLDB Journal.

[4]  Philip S. Yu,et al.  Efficient Method for Maximizing Bichromatic Reverse Nearest Neighbor , 2009, Proc. VLDB Endow..

[5]  King-Ip Lin,et al.  An index structure for efficient reverse nearest neighbor queries , 2001, Proceedings 17th International Conference on Data Engineering.

[6]  Ihab F. Ilyas,et al.  Supporting ranking queries on uncertain and incomplete data , 2010, The VLDB Journal.

[7]  Shashi Shekhar,et al.  Continuous Evaluation of Monochromatic and Bichromatic Reverse Nearest Neighbors , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[8]  Hans-Peter Kriegel,et al.  Efficient Probabilistic Reverse Nearest Neighbor Query Processing on Uncertain Data , 2011, Proc. VLDB Endow..

[9]  Elke Achtert,et al.  Reverse k-nearest neighbor search in dynamic and general metric databases , 2009, EDBT '09.

[10]  Muhammad Aamir Cheema,et al.  Lazy Updates: An Efficient Technique to Continuously Monitoring Reverse kNN , 2009, Proc. VLDB Endow..

[11]  Yufei Tao,et al.  Reverse Nearest Neighbor Search in Metric Spaces , 2006, IEEE Transactions on Knowledge and Data Engineering.

[12]  Wei Wu,et al.  FINCH: evaluating reverse k-Nearest-Neighbor queries on location data , 2008, Proc. VLDB Endow..

[13]  King-Ip Lin,et al.  Applying bulk insertion techniques for dynamic reverse nearest neighbor problems , 2003, Seventh International Database Engineering and Applications Symposium, 2003. Proceedings..

[14]  Divyakant Agrawal,et al.  Discovery of Influence Sets in Frequently Updated Databases , 2001, VLDB.

[15]  Christopher Ré,et al.  Access Methods for Markovian Streams , 2009, 2009 IEEE 25th International Conference on Data Engineering.

[16]  S. Muthukrishnan,et al.  Influence sets based on reverse nearest neighbor queries , 2000, SIGMOD '00.

[17]  Christian S. Jensen,et al.  Nearest and reverse nearest neighbor queries for moving objects , 2006, The VLDB Journal.

[18]  Yufei Tao,et al.  Reverse kNN Search in Arbitrary Dimensionality , 2004, VLDB.

[19]  Hans-Peter Kriegel,et al.  Querying Uncertain Spatio-Temporal Data , 2012, 2012 IEEE 28th International Conference on Data Engineering.

[20]  Jian Pei,et al.  Probabilistic Reverse Nearest Neighbor Queries on Uncertain Data , 2010, IEEE Transactions on Knowledge and Data Engineering.

[21]  Divyakant Agrawal,et al.  Reverse Nearest Neighbor Queries for Dynamic Databases , 2000, ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery.

[22]  Wei Wu,et al.  Continuous Reverse k-Nearest-Neighbor Monitoring , 2008, The Ninth International Conference on Mobile Data Management (mdm 2008).

[23]  Xiang Lian,et al.  A Generic Framework for Handling Uncertain Data with Local Correlations , 2010, Proc. VLDB Endow..

[24]  Sunil Prabhakar,et al.  Querying imprecise data in moving object environments , 2003, IEEE Transactions on Knowledge and Data Engineering.

[25]  Christopher Ré,et al.  Event queries on correlated probabilistic streams , 2008, SIGMOD Conference.

[26]  Peter J. Haas,et al.  MCDB: a monte carlo approach to managing uncertain data , 2008, SIGMOD Conference.

[27]  Tian Xia,et al.  Continuous Reverse Nearest Neighbor Monitoring , 2006, 22nd International Conference on Data Engineering (ICDE'06).

[28]  Christian S. Jensen,et al.  Indexing the positions of continuously moving objects , 2000, SIGMOD '00.