Range Queries on Uncertain Data

Given a set \(P\) of \(n\) uncertain points on the real line, each represented by its one-dimensional probability density function, we consider the problem of building data structures on \(P\) to answer range queries of the following three types for any query interval \(I\): (1) top-\(1\) query: find the point in \(P\) that lies in \(I\) with the highest probability, (2) top-\(k\) query: given any integer \(k\le n\) as part of the query, return the \(k\) points in \(P\) that lie in \(I\) with the highest probabilities, and (3) threshold query: given any threshold \(\tau \) as part of the query, return all points of \(P\) that lie in \(I\) with probabilities at least \(\tau \). We present data structures for these range queries with linear or near-linear space and efficient query time.
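
To make the three query types concrete, the following is a minimal brute-force sketch of their semantics only. It is not the data structures of the paper: it scans all \(n\) points on every query. It also assumes, purely for illustration, that each uncertain point is uniformly distributed over an interval; the names `UniformPoint`, `prob_in`, `top_1`, `top_k`, and `threshold` are hypothetical, and the query interval \(I\) is passed as its endpoints `a` and `b`.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class UniformPoint:
    """Hypothetical uncertain point: uniform pdf on [lo, hi], with lo < hi."""
    name: str
    lo: float
    hi: float

    def prob_in(self, a: float, b: float) -> float:
        """Probability that the point lies in the query interval I = [a, b]."""
        overlap = min(self.hi, b) - max(self.lo, a)
        return max(0.0, overlap) / (self.hi - self.lo)


def top_k(points, a, b, k):
    """Top-k query: the k points most likely to lie in [a, b], best first."""
    return heapq.nlargest(k, points, key=lambda p: p.prob_in(a, b))


def top_1(points, a, b):
    """Top-1 query: the single point most likely to lie in [a, b]."""
    return top_k(points, a, b, 1)[0]


def threshold(points, a, b, tau):
    """Threshold query: all points lying in [a, b] with probability >= tau."""
    return [p for p in points if p.prob_in(a, b) >= tau]


# Illustrative input (assumed data, not from the paper).
points = [
    UniformPoint("p1", 0.0, 4.0),   # probability 0.25 of lying in [2, 3]
    UniformPoint("p2", 1.0, 3.0),   # probability 0.5 of lying in [2, 3]
    UniformPoint("p3", 2.0, 10.0),  # probability 0.125 of lying in [2, 3]
]

print(top_1(points, 2.0, 3.0).name)                        # p2
print([p.name for p in top_k(points, 2.0, 3.0, 2)])        # ['p2', 'p1']
print([p.name for p in threshold(points, 2.0, 3.0, 0.2)])  # ['p1', 'p2']
```

Each query in this baseline costs time linear in \(n\) (up to a logarithmic factor for the top-\(k\) selection), which is the cost that a purpose-built data structure aims to avoid.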
