On using historical update information for instance identification in federated databases

To support interoperability in federated database systems, it is critical to identify (potentially) equivalent data instances held in the individual autonomous component databases. Because the components of a federation are autonomous, their data may be updated asynchronously: a modification to a real-world entity may be captured in different databases at different times. The authors term this effect update heterogeneity. Existing approaches largely base instance similarity identification only on current attribute/property values, which is inadequate in the face of update heterogeneity. The authors present an approach that addresses update heterogeneity in the federated database context. It employs a probabilistic model that uses historical database update information to estimate the degree of similarity between candidate data instances from different component databases. This historical information is drawn from transaction history (log) data, which is typically already available in the component database systems. The authors have implemented and experimentally tested the approach in a prototype federated database system, FeXpress.
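
To make the idea of update heterogeneity concrete, the following is a minimal illustrative sketch, not the probabilistic model of the paper, whose details are not given in this abstract. It assumes each instance keeps a per-attribute value history reconstructed from a transaction log; the names `History`, `attribute_score`, `instance_similarity`, and the `stale_weight` parameter are all hypothetical. A current value that disagrees with the other side's current value but matches one of its earlier values is treated as a probable lagging update rather than as a mismatch.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (commit time, value) pairs, oldest first,
# as they might be reconstructed from a component database's log.
History = List[Tuple[int, str]]


def attribute_score(hist_a: History, hist_b: History,
                    stale_weight: float = 0.5) -> float:
    """Score one attribute of two candidate instances.

    1.0 if the current values agree, stale_weight if one side's current
    value equals an earlier value of the other side (a possible lagging
    update), and 0.0 otherwise. The weights are illustrative assumptions.
    """
    if not hist_a or not hist_b:
        return 0.0
    cur_a, cur_b = hist_a[-1][1], hist_b[-1][1]
    if cur_a == cur_b:
        return 1.0
    past_a = {value for _, value in hist_a[:-1]}
    past_b = {value for _, value in hist_b[:-1]}
    if cur_a in past_b or cur_b in past_a:
        return stale_weight
    return 0.0


def instance_similarity(a: Dict[str, History], b: Dict[str, History]) -> float:
    """Average the attribute scores over the attributes both instances share."""
    shared = sorted(set(a) & set(b))
    if not shared:
        return 0.0
    return sum(attribute_score(a[k], b[k]) for k in shared) / len(shared)


# Database A has already recorded a move to Austin; database B has not.
example_a = {"name": [(1, "Ann Lee")], "city": [(1, "Boston"), (5, "Austin")]}
example_b = {"name": [(2, "Ann Lee")], "city": [(2, "Boston")]}
similarity = instance_similarity(example_a, example_b)
```

A comparison based only on current values would count `city` as a plain mismatch; consulting the history lets the score reflect that database B may simply not have captured the update yet.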
