Entity Identification in Database Integration

The objective of entity identification is to determine the correspondence between object instances from more than one database. This paper examines the problem at the instance level, assuming that schema-level heterogeneity has been resolved a priori. Soundness and completeness are defined as the desired properties of any entity identification technique. To achieve soundness, a set of identity and distinctness rules is established for entities in the integrated world. We propose the use of an extended key, which is the union of keys (and possibly other attributes) from the relations to be matched, together with its corresponding identity rule, to determine the equivalence between tuples from relations that may not share any common key. Instance-level functional dependencies (ILFDs), a form of semantic constraint information about the real-world entities, are used to derive the missing extended key attribute values of a tuple.
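The extended-key idea can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the two relations `r_rows` and `s_rows`, their attributes, the two ILFDs, and the helper names `apply_ilfds` and `match_by_extended_key`. Each ILFD is modeled as "if a tuple has these attribute values, then another attribute has this value", and two tuples are declared identical only when all extended key attributes are known and equal on both sides.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Row = Dict[str, Optional[str]]
# Hypothetical ILFD encoding: (antecedent attribute values, (derived attribute, derived value)).
ILFD = Tuple[Dict[str, str], Tuple[str, str]]


def apply_ilfds(row: Row, ilfds: Sequence[ILFD]) -> Row:
    """Return a copy of row with missing attribute values derived from the ILFDs."""
    derived = dict(row)
    changed = True
    while changed:  # repeat until no rule adds a new value
        changed = False
        for antecedent, (attr, value) in ilfds:
            holds = all(derived.get(a) == v for a, v in antecedent.items())
            if holds and derived.get(attr) is None:
                derived[attr] = value
                changed = True
    return derived


def match_by_extended_key(
    r_rows: List[Row], s_rows: List[Row],
    extended_key: Sequence[str], ilfds: Sequence[ILFD],
) -> List[Tuple[Row, Row]]:
    """Pair tuples whose extended key values are all known and equal."""
    pairs = []
    for r in r_rows:
        r_full = apply_ilfds(r, ilfds)
        for s in s_rows:
            s_full = apply_ilfds(s, ilfds)
            if all(
                r_full.get(a) is not None and r_full.get(a) == s_full.get(a)
                for a in extended_key
            ):
                pairs.append((r, s))
    return pairs


# Assumed example: R has key (name, cuisine), S has key (name, speciality).
r_rows = [
    {"name": "Twin Cities", "cuisine": "Chinese", "street": "Main St"},
    {"name": "Twin Cities", "cuisine": "Indian", "street": "Oak Ave"},
]
s_rows = [{"name": "Twin Cities", "speciality": "Hunan", "city": "St Paul"}]

# Extended key: the union of the two keys.
extended_key = ["name", "cuisine", "speciality"]

# Assumed ILFDs about the real-world entities.
ilfds = [
    ({"speciality": "Hunan"}, ("cuisine", "Chinese")),
    ({"name": "Twin Cities", "street": "Main St"}, ("speciality", "Hunan")),
]

matches = match_by_extended_key(r_rows, s_rows, extended_key, ilfds)
print(len(matches))
```

In this sketch a pair that fails the test is only left unmatched; it is not declared distinct. Concluding distinctness would require separate distinctness rules, which the sketch does not model.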
