Automated resolution of semantic heterogeneity in multidatabases

A multidatabase system provides integrated access to heterogeneous, autonomous local databases in a distributed system. An important problem in current multidatabase systems is identifying semantically similar data held in different local databases. The Summary Schemas Model (SSM) is proposed as an extension to multidatabase systems to aid in this semantic identification. The SSM uses a global data structure to abstract the information available in a multidatabase system. This abstraction lets users access data with their own terms (imprecise queries) rather than forcing them to use system-specified terms: the system uses the global data structure to match each user term to the semantically closest available system term. A simulation of the SSM compares the cost of imprecise-query processing with the corresponding query-processing cost in a standard multidatabase system. The costs and benefits of the SSM are discussed, and future research directions are presented.
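
The matching of a user's term to the closest system term can be illustrated with a minimal sketch. The abstract does not specify the global data structure or the closeness measure, so everything below is an assumption made for illustration: the toy term hierarchy in `PARENT`, the database names in `SYSTEM_TERMS`, and the use of path length through the lowest common ancestor as the semantic distance are all hypothetical, not the SSM's actual design.

```python
# Hypothetical term hierarchy: each term maps to a more general parent term.
# This stands in for the "global data structure"; the real SSM structure
# is not described in the abstract.
PARENT = {
    "salary": "earnings",
    "wage": "earnings",
    "pay": "earnings",
    "earnings": "income",
    "dividend": "income",
    "income": "asset",
    "asset": "attribute",
    "address": "location",
    "location": "attribute",
}

# Hypothetical system terms exported by local databases (term -> database).
SYSTEM_TERMS = {
    "salary": "db_payroll",
    "dividend": "db_finance",
    "address": "db_hr",
}


def ancestors(term):
    """Return [term, parent, grandparent, ...] up to the root."""
    chain = [term]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def distance(a, b):
    """Number of edges between a and b via their lowest common ancestor.

    Returns None when the two terms share no ancestor in the hierarchy.
    """
    depth_in_b = {t: i for i, t in enumerate(ancestors(b))}
    for steps_from_a, t in enumerate(ancestors(a)):
        if t in depth_in_b:
            return steps_from_a + depth_in_b[t]
    return None


def closest_system_term(user_term):
    """Map a user's (imprecise) term to the nearest system term.

    Returns (system_term, database, distance), or None if nothing is related.
    """
    scored = []
    for term in SYSTEM_TERMS:
        d = distance(user_term, term)
        if d is not None:
            scored.append((d, term))
    if not scored:
        return None
    d, term = min(scored)  # smallest distance wins; ties break alphabetically
    return term, SYSTEM_TERMS[term], d


if __name__ == "__main__":
    print(closest_system_term("wage"))
    print(closest_system_term("zebra"))
```

In this sketch a query on the user's term "wage" resolves to the system term "salary" because the two share the immediate parent "earnings", while a term absent from the hierarchy matches nothing. A real system would need a far larger vocabulary and a principled distance measure.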
