Semi-automatic, semantic discovery of properties from database schemes

An important tool for integrating large federated database systems is a global dictionary that describes all the involved schemes within a unified framework. The first step in constructing such a dictionary is discovering the properties that hold among objects in different schemes. This paper presents novel algorithms for discovering possible synonyms, homonyms and inclusions. It also addresses another crucial step in constructing a dictionary: schema integration. The approach proposed for this step exploits the inter-schema properties discovered in the previous step. It also produces suitable abstractions that structure the global dictionary as a hierarchy of concepts, yielding a more flexible, uniform view of the attached databases. These two steps are interleaved with others, mainly devoted to interfacing with database administrators and to validating the discovered properties. The approach has been applied experimentally to the construction of a global dictionary for a large number of public administration database systems.
