Working Models for Uncertain Data

This paper explores an inherent tension in modeling and querying uncertain data: simple, intuitive representations of uncertain data capture many application requirements, but these representations are generally incomplete: standard operations over the data may result in unrepresentable types of uncertainty. Complete models are theoretically attractive, but they can be nonintuitive and more complex than necessary for many applications. To address this tension, we propose a two-layer approach to managing uncertain data: an underlying logical model that is complete, and one or more working models that are easier to understand, visualize, and query, but may lose some information. We explore the space of incomplete working models, place several of them in a strict hierarchy based on expressive power, and study their closure properties. We describe how the two-layer approach is being used in our prototype DBMS for uncertain data, and we identify a number of interesting open problems to fully realize the approach.
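
To make the notion of incompleteness concrete, the following is a minimal Python sketch and not the paper's own formalism. It assumes one particular simple model, a one-attribute relation whose tuples carry or-sets (sets of alternative values), and treats a relation's meaning as its set of possible worlds. All function names are hypothetical. The sketch applies a selection world by world and then checks, by brute force over a small bounded space, that no relation in the assumed model has exactly the resulting worlds.

```python
from itertools import combinations, product


def possible_worlds(or_relation):
    """Enumerate possible worlds of a one-attribute or-set relation.

    or_relation is a sequence of or-sets; each or-set lists the alternative
    values of one tuple. A world picks exactly one value per tuple and is
    stored as a frozenset (set semantics, so duplicates collapse).
    """
    return {frozenset(choice) for choice in product(*or_relation)}


def select_worlds(worlds, predicate):
    """Apply a selection to every possible world independently."""
    return {frozenset(v for v in w if predicate(v)) for w in worlds}


def representable(target_worlds, domain, max_tuples):
    """Brute-force check: does some or-set relation with at most max_tuples
    tuples over the given domain have exactly target_worlds as its worlds?
    The bound keeps the search finite; it is an illustration, not a proof.
    """
    or_sets = [
        frozenset(c)
        for r in range(1, len(domain) + 1)
        for c in combinations(sorted(domain), r)
    ]
    for n in range(max_tuples + 1):
        for rel in product(or_sets, repeat=n):
            if possible_worlds(rel) == target_worlds:
                return True
    return False


# One tuple whose value is either 1 or 2.
worlds = possible_worlds([{1, 2}])

# Selecting value == 1 keeps the tuple in one world and drops it in the other.
selected = select_worlds(worlds, lambda v: v == 1)

print(representable(worlds, {1, 2}, 3))    # the input is representable
print(representable(selected, {1, 2}, 3))  # the selection result is not
```

Under these assumptions the selection result consists of one world containing the tuple and one empty world. An or-set relation with at least one tuple never has an empty world, and one with no tuples has only the empty world, so this uncertainty about a tuple's existence falls outside the assumed model.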
