Managing Uncertain Data: Probabilistic Approaches

Uncertain data are inherent in many important applications. Recently, considerable research efforts have been put into the field of managing uncertain data. In this paper, we summarize existing techniques to query and model uncertain data and systems that effectively manage uncertain data, mainly from a probabilistic point of view.

[1]  José Galindo,et al.  Fuzzy Databases: Modeling, Design, and Implementation , 2006 .

[2]  Sunil Prabhakar,et al.  Managing Uncertainty in Sensor Databases , 2003 .

[3]  Jeffrey Scott Vitter,et al.  Efficient Indexing Methods for Probabilistic Threshold Queries over Uncertain Data , 2004, VLDB.

[4]  V. S. Subrahmanian,et al.  Probabilistic interval XML , 2003, TOCL.

[5]  Yehoshua Sagiv,et al.  Matching Twigs in Probabilistic XML , 2007, VLDB.

[6]  Philip S. Yu,et al.  A Survey of Uncertain Data Algorithms and Applications , 2009, IEEE Transactions on Knowledge and Data Engineering.

[7]  Shashi Shekhar,et al.  An Evidential Reasoning Approach to Attribute Value Conflict Resolution in Database Integration , 1996, IEEE Trans. Knowl. Data Eng..

[8]  Susanne E. Hambrusch,et al.  Indexing Uncertain Categorical Data , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[9]  Jennifer Widom,et al.  Trio: A System for Integrated Management of Data, Accuracy, and Lineage , 2004, CIDR.

[10]  Jian Pei,et al.  Ranking queries on uncertain data: a probabilistic threshold approach , 2008, SIGMOD Conference.

[11]  Tomasz Imielinski,et al.  Incomplete Information in Relational Databases , 1984, JACM.

[12]  Bin Jiang,et al.  Probabilistic Skylines on Uncertain Data , 2007, VLDB.

[13]  Sally I. McClean,et al.  Database aggregation of imprecise and uncertain evidence , 2003, Inf. Sci..

[14]  Yufei Tao,et al.  Range search on multidimensional uncertain data , 2007, TODS.

[15]  Reynold Cheng,et al.  Uncertain Data Mining: An Example in Clustering Location Data , 2006, PAKDD.

[16]  H. V. Jagadish,et al.  ProTDB: Probabilistic Data in XML , 2002, VLDB.

[17]  Charu C. Aggarwal On Unifying Privacy and Uncertain Data Models , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[18]  Sunil Prabhakar,et al.  Managing uncertainty in sensor database , 2003, SGMD.

[19]  Xiang Lian,et al.  Monochromatic and bichromatic reverse skyline search over uncertain databases , 2008, SIGMOD Conference.

[20]  Jian Pei,et al.  Efficiently Answering Probabilistic Threshold Top-k Queries on Uncertain Data , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[21]  Hans-Peter Kriegel,et al.  Density-based clustering of uncertain data , 2005, KDD '05.

[22]  Francesco Bonchi,et al.  Never Walk Alone: Uncertainty for Anonymity in Moving Objects Databases , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[23]  Norbert Fuhr,et al.  A Probabilistic NF2 Relational Algebra for Imprecision in Databases , 1997 .

[24]  Graham Cormode,et al.  Sketching probabilistic data streams , 2007, SIGMOD '07.

[25]  Jennifer Widom,et al.  Working Models for Uncertain Data , 2006, 22nd International Conference on Data Engineering (ICDE'06).

[26]  Sunil Prabhakar,et al.  Querying imprecise data in moving object environments , 2003, Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405).

[27]  T. S. Jayram,et al.  OLAP over uncertain and imprecise data , 2007, The VLDB Journal.

[28]  Christian Böhm,et al.  The Gauss-Tree: Efficient Object Identification in Databases of Probabilistic Feature Vectors , 2006, 22nd International Conference on Data Engineering (ICDE'06).

[29]  Feifei Li,et al.  Efficient Processing of Top-k Queries in Uncertain Databases with x-Relations , 2008, IEEE Transactions on Knowledge and Data Engineering.

[30]  Sunil Prabhakar,et al.  Evaluating probabilistic queries over imprecise data , 2003, SIGMOD '03.

[31]  Dan Suciu,et al.  Management of probabilistic data: foundations and challenges , 2007, PODS '07.

[32]  Christopher Ré,et al.  MYSTIQ: a system for finding more answers by using probabilities , 2005, SIGMOD '05.

[33]  Dan Olteanu,et al.  10106 Worlds and Beyond: Efficient Representation and Processing of Incomplete Information , 2007, ICDE.

[34]  Yufei Tao,et al.  Indexing Multi-Dimensional Uncertain Data with Arbitrary Probability Density Functions , 2005, VLDB.

[35]  Philip S. Yu,et al.  On High Dimensional Indexing of Uncertain Data , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[36]  Dan Suciu,et al.  Efficient query evaluation on probabilistic databases , 2004, The VLDB Journal.

[37]  T. S. Jayram,et al.  Efficient aggregation algorithms for probabilistic data , 2007, SODA '07.

[38]  V. S. Subrahmanian,et al.  PXML: a probabilistic semistructured data model and algebra , 2003, Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405).

[39]  Kevin Chen-Chuan Chang,et al.  URank: formulation and efficient evaluation of top-k queries in uncertain databases , 2007, SIGMOD '07.

[40]  Sally I. McClean,et al.  Aggregation of Imprecise and Uncertain Information in Databases , 2001, IEEE Trans. Knowl. Data Eng..

[41]  Norbert Fuhr,et al.  A Probabilistic Framework for Vague Queries and Imprecise Information in Databases , 1990, VLDB.

[42]  Serge Abiteboul,et al.  On the Representation and Querying of Sets of Possible Worlds , 1991, Theor. Comput. Sci..

[43]  Serge Abiteboul,et al.  On the complexity of managing probabilistic XML data , 2007, PODS '07.

[44]  Reynold Cheng,et al.  Efficient Clustering of Uncertain Data , 2006, Sixth International Conference on Data Mining (ICDM'06).

[45]  Parag Agrawal,et al.  Trio: a system for data, uncertainty, and lineage , 2006, VLDB.

[46]  Dan Olteanu,et al.  Efficient Representation and Processing of Incomplete Information , 2006 .

[47]  Jennifer Widom,et al.  An Introduction to ULDBs and the Trio System , 2006, IEEE Data Eng. Bull..

[48]  Amihai Motro,et al.  VAGUE: a user interface to relational databases that permits vague queries , 1988, TOIS.

[49]  Jeffrey Scott Vitter,et al.  Efficient join processing over uncertain data , 2006, CIKM '06.

[50]  Christian Böhm,et al.  ProVeR: Probabilistic Video Retrieval using the Gauss-Tree , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[51]  Hans-Peter Kriegel,et al.  Probabilistic Similarity Join on Uncertain Data , 2006, DASFAA.

[52]  Dan Suciu,et al.  The dichotomy of conjunctive queries on probabilistic structures , 2006, PODS '07.

[53]  Hans-Peter Kriegel,et al.  Probabilistic Nearest-Neighbor Query on Uncertain Objects , 2007, DASFAA.

[54]  Dan Olteanu,et al.  $${10^{(10^{6})}}$$ worlds and beyond: efficient representation and processing of incomplete information , 2006, 2007 IEEE 23rd International Conference on Data Engineering.

[55]  Robert B. Ross,et al.  Aggregate operators in probabilistic databases , 2005, JACM.

[56]  Christopher Ré,et al.  Efficient Top-k Query Evaluation on Probabilistic Data , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[57]  Prithviraj Sen,et al.  Representing and Querying Correlated Tuples in Probabilistic Databases , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[58]  Sumit Sarkar,et al.  A probabilistic relational model and algebra , 1996, TODS.

[59]  Elke A. Rundensteiner,et al.  Evaluating aggregates in possibilistic relational databases , 1992, Data Knowl. Eng..

[60]  Arbee L. P. Chen,et al.  Evaluating Aggregate Operations Over Imprecise Data , 1996, IEEE Trans. Knowl. Data Eng..

[61]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems , 1988 .

[62]  Reynold Cheng,et al.  Sensor node selection for execution of continuous probabilistic queries in wireless sensor networks , 2004, VSSN '04.

[63]  Dan Suciu,et al.  Answering Queries from Statistics and Probabilistic Views , 2005, VLDB.

[64]  Christian Böhm,et al.  Probabilistic Ranking Queries on Gaussians , 2006, 18th International Conference on Scientific and Statistical Database Management (SSDBM'06).

[65]  Suk Kyoon Lee,et al.  Imprecise and uncertain information in databases: an evidential approach , 1992, [1992] Eighth International Conference on Data Engineering.

[66]  Mohamed A. Soliman,et al.  Top-k Query Processing in Uncertain Databases , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[67]  Parag Agrawal,et al.  Trio-One: Layering Uncertainty and Lineage on a Conventional DBMS (Demo) , 2007, CIDR.

[68]  Chi-Yin Chow,et al.  Probabilistic Verifiers: Evaluating Constrained Nearest-Neighbor Queries over Uncertain Data , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[69]  Serge Abiteboul,et al.  Querying and Updating Probabilistic Information in XML , 2006, EDBT.

[70]  Jennifer Widom,et al.  ULDBs: databases with uncertainty and lineage , 2006, VLDB.

[71]  Jennifer Widom,et al.  Exploiting Lineage for Confidence Computation in Uncertain and Probabilistic Databases , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[72]  Laks V. S. Lakshmanan,et al.  ProbView: a flexible probabilistic database system , 1997, TODS.

[73]  Jian Pei,et al.  Query answering techniques on uncertain and probabilistic data: tutorial summary , 2008, SIGMOD Conference.

[74]  Hector Garcia-Molina,et al.  The Management of Probabilistic Data , 1992, IEEE Trans. Knowl. Data Eng..