Processing Probabilistic Range Queries over Gaussian-Based Uncertain Data

Probabilistic range queries are an important class of queries in uncertain data management. A probabilistic range query returns every object that lies within a given distance of the query object with probability no less than a given threshold. In this paper, we assume that each uncertain object stored in the database is associated with a multi-dimensional Gaussian distribution, which describes the probability of the object appearing at each location in the multi-dimensional space. A query object is either a certain object or an uncertain object modeled by a Gaussian distribution. We propose several filtering techniques and an R-tree-based index to efficiently support probabilistic range queries over Gaussian objects. Extensive experiments on real data demonstrate the efficiency of our proposed approach.
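To make the query definition concrete, the following is a minimal brute-force sketch, not the filtering techniques or the R-tree-based index proposed in the paper. It assumes a certain (point) query object, Gaussian objects with diagonal covariance given as per-dimension standard deviations, and a Monte Carlo estimate of the probability. The names `appearance_probability` and `probabilistic_range_query`, and the sample data, are hypothetical.

```python
import math
import random


def appearance_probability(mean, stddev, query, radius, samples=20000, seed=0):
    """Monte Carlo estimate of P(||X - query|| <= radius).

    Assumption: X ~ N(mean, diag(stddev^2)), i.e. independent dimensions.
    """
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    hits = 0
    for _ in range(samples):
        # Draw one sample of the uncertain object's position.
        point = [rng.gauss(m, s) for m, s in zip(mean, stddev)]
        if math.dist(point, query) <= radius:
            hits += 1
    return hits / samples


def probabilistic_range_query(objects, query, radius, threshold):
    """Return ids of objects within `radius` of `query` with probability >= threshold.

    Every object is evaluated; there is no filtering or index here.
    """
    return [
        oid
        for oid, (mean, stddev) in objects.items()
        if appearance_probability(mean, stddev, query, radius) >= threshold
    ]


# Hypothetical 2-D Gaussian objects: id -> (mean, per-dimension std. deviation)
objects = {
    "a": ((0.0, 0.0), (0.5, 0.5)),  # centred on the query point
    "b": ((3.0, 3.0), (0.5, 0.5)),  # far outside the range
    "c": ((2.0, 0.0), (1.0, 1.0)),  # mean outside the range, wide spread
}

result = probabilistic_range_query(
    objects, query=(0.0, 0.0), radius=1.5, threshold=0.5
)
print(result)
```

Because this sketch samples every object, its cost grows linearly with the database size and the sample count, which is the overhead that filtering and indexing are meant to reduce.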
