Query Processing over Uncertain Databases

Due to measurement errors, transmission lost, or injected noise for privacy protection, uncertainty exists in the data of many real applications. However, query processing techniques for deterministic data cannot be directly applied to uncertain data because they do not have mechanisms to handle data uncertainty. Therefore, efficient and effective manipulation of uncertain data is a practical yet challenging research topic. In this book, we start from the data models for imprecise and uncertain data, move on to defining different semantics for queries on uncertain data, and finally discuss the advanced query processing techniques for various probabilistic queries in uncertain databases. The book serves as a comprehensive guideline for query processing over uncertain databases. Table of Contents: Introduction / Uncertain Data Models / Spatial Query Semantics over Uncertain Data Models / Spatial Query Processing over Uncertain Databases / Conclusion

[1]  Christos Faloutsos,et al.  Fast subsequence matching in time-series databases , 1994, SIGMOD '94.

[2]  Heng Tao Shen,et al.  Multi-source Skyline Query Processing in Road Networks , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[3]  Yufei Tao,et al.  Indexing Multi-Dimensional Uncertain Data with Arbitrary Probability Density Functions , 2005, VLDB.

[4]  Xiang Lian,et al.  Probabilistic ranked queries in uncertain databases , 2008, EDBT '08.

[5]  Dan Suciu,et al.  Efficient query evaluation on probabilistic databases , 2004, The VLDB Journal.

[6]  Jimeng Sun,et al.  Hiding in the Crowd: Privacy Preservation on Evolving Streams through Correlation Tracking , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[7]  Victor Maojo,et al.  Medical Data Analysis , 2001, Lecture Notes in Computer Science.

[8]  Chi-Yin Chow,et al.  A peer-to-peer spatial cloaking algorithm for anonymous location-based service , 2006, GIS '06.

[9]  Kevin Chen-Chuan Chang,et al.  Supporting ranking and clustering as generalized order-by and group-by , 2007, SIGMOD '07.

[10]  Pavel Zezula,et al.  M-tree: An Efficient Access Method for Similarity Search in Metric Spaces , 1997, VLDB.

[11]  Bin Jiang,et al.  Probabilistic Skylines on Uncertain Data , 2007, VLDB.

[12]  Hanan Samet,et al.  Distance browsing in spatial databases , 1999, TODS.

[13]  ASHWIN MACHANAVAJJHALA,et al.  L-diversity: privacy beyond k-anonymity , 2006, 22nd International Conference on Data Engineering (ICDE'06).

[14]  Kyriakos Mouratidis,et al.  Aggregate nearest neighbor queries in spatial databases , 2005, TODS.

[15]  Xiang Lian,et al.  Probabilistic Inverse Ranking Queries over Uncertain Data , 2009, DASFAA.

[16]  Xiang Lian,et al.  Monochromatic and bichromatic reverse skyline search over uncertain databases , 2008, SIGMOD Conference.

[17]  Lei Chen,et al.  Robust and fast similarity search for moving object trajectories , 2005, SIGMOD '05.

[18]  Dieter Pfoser,et al.  Capturing the Uncertainty of Moving-Object Representations , 1999, SSD.

[19]  Ramesh C. Jain,et al.  Similarity indexing with the SS-tree , 1996, Proceedings of the Twelfth International Conference on Data Engineering.

[20]  Bernhard Seeger,et al.  An optimal and progressive algorithm for skyline queries , 2003, SIGMOD '03.

[21]  Yufei Tao,et al.  Reverse kNN Search in Arbitrary Dimensionality , 2004, VLDB.

[22]  Xiang Lian,et al.  Probabilistic Group Nearest Neighbor Queries in Uncertain Databases , 2008, IEEE Transactions on Knowledge and Data Engineering.

[23]  Ambuj K. Singh,et al.  APLA: Indexing Arbitrary Probability Distributions , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[24]  Jeffrey Xu Yu,et al.  Spatial Range Querying for Gaussian-Based Imprecise Query Objects , 2009, 2009 IEEE 25th International Conference on Data Engineering.

[25]  Hans-Peter Kriegel,et al.  Probabilistic Nearest-Neighbor Query on Uncertain Objects , 2007, DASFAA.

[26]  Nick Roussopoulos,et al.  Nearest neighbor queries , 1995, SIGMOD '95.

[27]  Christos Faloutsos,et al.  Fast Nearest Neighbor Search in Medical Image Databases , 1996, VLDB.

[28]  Xiang Lian,et al.  Pattern Matching over Cloaked Time Series , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[29]  Philippe Bonnet,et al.  GADT: a probability space ADT for representing and querying the physical world , 2002, Proceedings 18th International Conference on Data Engineering.

[30]  Donald Kossmann,et al.  The Skyline operator , 2001, Proceedings 17th International Conference on Data Engineering.

[31]  Yufei Tao,et al.  Reverse nearest neighbors in large graphs , 2005, 21st International Conference on Data Engineering (ICDE'05).

[32]  Ihab F. Ilyas,et al.  Efficient search for the top-k probable nearest neighbors in uncertain databases , 2008, Proc. VLDB Endow..

[33]  Jennifer Widom,et al.  Working Models for Uncertain Data , 2006, 22nd International Conference on Data Engineering (ICDE'06).

[34]  Xiang Lian,et al.  Top-k dominating queries in uncertain databases , 2009, EDBT '09.

[35]  Jochen Schiller,et al.  Location Based Services , 2004 .

[36]  Christopher Ré,et al.  Efficient Top-k Query Evaluation on Probabilistic Data , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[37]  Reynold Cheng,et al.  Efficient Evaluation of Imprecise Location-Dependent Queries , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[38]  Elke Achtert,et al.  Efficient reverse k-nearest neighbor search in arbitrary metric spaces , 2006, SIGMOD Conference.

[39]  Xiang Lian,et al.  Efficient processing of probabilistic reverse nearest neighbor queries over uncertain data , 2009, The VLDB Journal.

[40]  Flip Korn,et al.  Influence sets based on reverse nearest neighbor queries , 2000, SIGMOD 2000.

[41]  Daisy Zhe Wang,et al.  Querying probabilistic information extraction , 2010, Proc. VLDB Endow..

[42]  Mohamed A. Soliman,et al.  Top-k Query Processing in Uncertain Databases , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[43]  Jian Pei,et al.  Ranking queries on uncertain data: a probabilistic threshold approach , 2008, SIGMOD Conference.

[44]  Divesh Srivastava,et al.  Reverse Nearest Neighbor Aggregates Over Data Streams , 2002, VLDB.

[45]  Antonin Guttman,et al.  R-trees: a dynamic index structure for spatial searching , 1984, SIGMOD '84.

[46]  Ambuj K. Singh,et al.  Top-k Spatial Joins of Probabilistic Objects , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[47]  Christian Böhm,et al.  The Gauss-Tree: Efficient Object Identification in Databases of Probabilistic Feature Vectors , 2006, 22nd International Conference on Data Engineering (ICDE'06).

[48]  Sunil Prabhakar,et al.  Querying imprecise data in moving object environments , 2003, IEEE Transactions on Knowledge and Data Engineering.

[49]  Divyakant Agrawal,et al.  Reverse Nearest Neighbor Queries for Dynamic Databases , 2000, ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery.

[50]  Christos Faloutsos,et al.  Efficient Similarity Search In Sequence Databases , 1993, FODO.

[51]  Hans-Peter Kriegel,et al.  Probabilistic Similarity Join on Uncertain Data , 2006, DASFAA.

[52]  Feifei Li,et al.  Efficient Processing of Top-k Queries in Uncertain Databases with x-Relations , 2008, IEEE Trans. Knowl. Data Eng..

[53]  Feifei Li,et al.  Semantics of Ranking Queries for Probabilistic Data and Expected Ranks , 2009, 2009 IEEE 25th International Conference on Data Engineering.

[54]  Hans-Peter Kriegel,et al.  The R*-tree: an efficient and robust access method for points and rectangles , 1990, SIGMOD '90.

[55]  Man Lung Yiu,et al.  Reverse Nearest Neighbors Search in Ad Hoc Subspaces , 2007, IEEE Trans. Knowl. Data Eng..

[56]  Felix Naumann,et al.  Data fusion , 2009, CSUR.

[57]  Jeffrey Scott Vitter,et al.  Efficient Indexing Methods for Probabilistic Threshold Queries over Uncertain Data , 2004, VLDB.

[58]  Shin'ichi Satoh,et al.  The SR-tree: an index structure for high-dimensional nearest neighbor queries , 1997, SIGMOD '97.

[59]  Bernhard Seeger,et al.  Efficient Computation of Reverse Skyline Queries , 2007, VLDB.

[60]  Chi-Yin Chow,et al.  Probabilistic Verifiers: Evaluating Constrained Nearest-Neighbor Queries over Uncertain Data , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[61]  Yunhao Liu,et al.  Mining Frequent Trajectory Patterns for Activity Monitoring Using Radio Frequency Tag Arrays , 2007, Fifth Annual IEEE International Conference on Pervasive Computing and Communications (PerCom'07).

[62]  Divesh Srivastava,et al.  Integrating Conflicting Data: The Role of Source Dependence , 2009, Proc. VLDB Endow..

[63]  Shashi Shekhar,et al.  Continuous Evaluation of Monochromatic and Bichromatic Reverse Nearest Neighbors , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[64]  Philip S. Yu,et al.  Time Series Compressibility and Privacy , 2007, VLDB.

[65]  Latanya Sweeney,et al.  Achieving k-Anonymity Privacy Protection Using Generalization and Suppression , 2002, Int. J. Uncertain. Fuzziness Knowl. Based Syst..

[66]  Xiang Lian,et al.  Probabilistic Time Consistent Queries over Moving Objects , 2011, SSDBM.

[67]  Philip S. Yu,et al.  On High Dimensional Indexing of Uncertain Data , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[68]  Amit Singh,et al.  High dimensional reverse nearest neighbor queries , 2003, CIKM '03.

[69]  Susanne E. Hambrusch,et al.  Database Support for Probabilistic Attributes and Tuples , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[70]  Walid G. Aref,et al.  Casper*: Query processing for location services without compromising privacy , 2006, TODS.

[71]  Sunil Prabhakar,et al.  Evaluating probabilistic queries over imprecise data , 2003, SIGMOD '03.

[72]  Feifei Li,et al.  Efficient Processing of Top-k Queries in Uncertain Databases with x-Relations , 2008, IEEE Transactions on Knowledge and Data Engineering.

[73]  Ivan Stojmenovic,et al.  Sensor Networks , 2005 .

[74]  Yunhao Liu,et al.  Underground coal mine monitoring with wireless sensor networks , 2009, TOSN.

[75]  Hans-Peter Kriegel,et al.  Optimal multi-step k-nearest neighbor search , 1998, SIGMOD '98.

[76]  Ling Liu,et al.  Location Privacy in Mobile Systems: A Personalized Anonymization Model , 2005, 25th IEEE International Conference on Distributed Computing Systems (ICDCS'05).

[77]  King-Ip Lin,et al.  An index structure for efficient reverse nearest neighbor queries , 2001, Proceedings 17th International Conference on Data Engineering.

[78]  Yannis Manolopoulos,et al.  Performance of Nearest Neighbor Queries in R-Trees , 1997, ICDT.

[79]  Kyriakos Mouratidis,et al.  Group nearest neighbor queries , 2004, Proceedings. 20th International Conference on Data Engineering.