Answering heterogeneous database queries with degrees of uncertainty

In heterogeneous database systems, partial values have been used to resolve some schema integration problems. Performing operations on partial values may produce maybe tuples in the query result, and these tuples cannot be compared with one another. Users therefore have no way to tell which maybe tuple is the most likely answer. In this paper, the concept of partial values is generalized to probabilistic partial values. We propose an approach that resolves the schema integration problems using probabilistic partial values, and we develop a full set of extended relational operators for manipulating relations that contain probabilistic partial values. With this approach, the uncertain answer tuples of a query are associated with degrees of uncertainty, represented by probabilities. This lets users compare maybe tuples and gives them a better understanding of the query results. In addition, extended selection and join are generalized to α-selection and α-join, respectively, which can be used to filter out maybe tuples with low probabilities, that is, those whose probabilities are smaller than α.
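To make the idea of α-selection concrete, the following is a minimal sketch rather than the operator as defined in the paper. It assumes that a probabilistic partial value can be modelled as a mapping from each possible value to its probability (summing to 1), and that the probability of a tuple satisfying a selection condition is the total probability of the possible values that satisfy it. The function `alpha_select`, the relation `employees`, and the attribute names are hypothetical and used only for illustration.

```python
from fractions import Fraction

# Assumption: a probabilistic partial value is a dict mapping each possible
# value to its probability, with the probabilities summing to 1.


def alpha_select(relation, attr, predicate, alpha):
    """Return (tuple, probability) pairs for tuples that satisfy `predicate`
    on `attr` with probability at least `alpha` (and greater than zero)."""
    result = []
    for row in relation:
        ppv = row[attr]
        # Probability that this tuple satisfies the selection condition.
        p = sum(prob for value, prob in ppv.items() if predicate(value))
        if p > 0 and p >= alpha:
            result.append((row, p))
    return result


# Hypothetical integrated relation: "city" holds probabilistic partial values.
employees = [
    {"name": "Ann", "city": {"Taipei": Fraction(1)}},
    {"name": "Bob", "city": {"Taipei": Fraction(7, 10), "Hsinchu": Fraction(3, 10)}},
    {"name": "Cid", "city": {"Taipei": Fraction(1, 5), "Tainan": Fraction(4, 5)}},
]

# Select employees located in Taipei, dropping answers below alpha = 1/2.
answers = alpha_select(employees, "city", lambda c: c == "Taipei", Fraction(1, 2))
for row, p in answers:
    print(row["name"], p)
```

In this sketch, Ann is a definite answer (probability 1), Bob is a maybe tuple that is kept because its probability 7/10 is not smaller than α, and Cid is a maybe tuple that is filtered out because 1/5 is smaller than α. Exact fractions are used instead of floats so that comparisons against α are not affected by rounding.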
