An Evidential Reasoning Approach to Attribute Value Conflict Resolution in Database Integration

Resolving domain incompatibility among independently developed databases often involves uncertain information. DeMichiel (1989) showed that uncertain information can be generated when conflicting attributes are mapped to a common domain on the basis of domain knowledge. We show that uncertain information can also arise when the database integration process requires information that is not directly represented in the component databases but can be obtained through some summary of the data. We therefore propose an extended relational model, based on the Dempster-Shafer theory of evidence, that incorporates such uncertain knowledge about the source databases. The extended relation represents uncertainty with evidence sets, which allow probabilities to be attached to subsets of possible domain values. We also develop a full set of extended relational operations over the extended relations. In particular, we formalize an extended union operation that combines two extended relations using Dempster's rule of combination. We formulate the closure and boundedness properties of the proposed extended operations and illustrate their use with query examples.
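As a rough illustration of the combination step mentioned above, the following sketch applies Dempster's rule to two evidence sets describing the same attribute value. It is not the paper's formal extended union: the representation (a dict from `frozenset` of domain values to a probability mass), the function name `combine_evidence`, and the rating values are all assumptions made for this example.

```python
from fractions import Fraction
from itertools import product


def combine_evidence(m1, m2):
    """Combine two evidence sets with Dempster's rule of combination.

    Each evidence set maps a frozenset of candidate domain values to the
    mass assigned to that subset; the masses of each input sum to 1.
    """
    combined = {}
    conflict = Fraction(0)
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two subsets.
            combined[common] = combined.get(common, Fraction(0)) + mass_a * mass_b
        else:
            # Disjoint subsets contribute only to the conflict mass.
            conflict += mass_a * mass_b
    if conflict == 1:
        raise ValueError("sources are in total conflict; combination is undefined")
    # Renormalize so the remaining masses sum to 1.
    return {subset: mass / (1 - conflict) for subset, mass in combined.items()}


# Hypothetical ratings of the same entity taken from two source databases.
source_a = {
    frozenset({"excellent"}): Fraction(1, 2),
    frozenset({"excellent", "good"}): Fraction(1, 2),
}
source_b = {
    frozenset({"good"}): Fraction(1, 2),
    frozenset({"excellent", "good", "average"}): Fraction(1, 2),
}

merged = combine_evidence(source_a, source_b)
for subset, mass in merged.items():
    print(sorted(subset), mass)
```

In this example one quarter of the joint mass falls on disjoint subsets, so the three surviving subsets ({excellent}, {good}, and {excellent, good}) each end up with mass 1/3 after renormalization. Exact fractions are used only to keep the arithmetic easy to check by hand.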

[1]  James F. Baldwin, et al.  Evidential support logic programming, 1987.

[2]  Judea Pearl, et al.  Bayesian and belief-functions formalisms for evidential reasoning: a conceptual analysis, 1990.

[3]  Suk Kyoon Lee, et al.  Imprecise and uncertain information in databases: an evidential approach, 1992, Eighth International Conference on Data Engineering.

[4]  Linda G. DeMichiel, et al.  Resolving Database Incompatibility: An Approach to Performing Relational Operations over Mismatched Domains, 1989, IEEE Trans. Knowl. Data Eng.

[5]  Witold Litwin, et al.  Multidatabase Interoperability, 1986, Computer.

[6]  Jaideep Srivastava, et al.  Entity identification in database integration, 1993, Proceedings of IEEE 9th International Conference on Data Engineering.

[7]  Arie Segev, et al.  Data manipulation in heterogeneous databases, 1991, SIGMOD Rec.

[8]  Glenn Shafer.  A Mathematical Theory of Evidence, 1976.

[9]  Hector Garcia-Molina, et al.  The Management of Probabilistic Data, 1992, IEEE Trans. Knowl. Data Eng.

[10]  Rangasami L. Kashyap, et al.  Belief combination and propagation in a lattice-structured inference network, 1990, IEEE Trans. Syst. Man Cybern.

[11]  Umeshwar Dayal, et al.  Processing Queries Over Generalization Hierarchies in a Multidatabase System, 1983, VLDB.

[12]  Yoshikane Takahashi.  Fuzzy Database Query Languages and Their Relational Completeness Theorem, 1993, IEEE Trans. Knowl. Data Eng.

[13]  James A. Larson, et al.  A Theory of Attribute Equivalence in Databases with Application to Schema Integration, 1989, IEEE Trans. Software Eng.