A uniform methodology for extracting type conflicts and subscheme similarities from heterogeneous databases

Abstract: Cooperative Information Systems have been proposed to provide uniform access to heterogeneous data while preserving the operational autonomy of the underlying sources. They use global dictionaries defined on the basis of interscheme properties; these include nominal and structural properties, type conflicts and object cluster similarities. Whereas several techniques have been proposed in the literature for deriving nominal and structural properties, few approaches exist for detecting type conflicts and object cluster similarities. The type of an object indicates whether it is an entity, a relationship or an attribute; a type conflict arises when objects represent the same concept yet have different types. Object cluster similarities denote similarities between portions of different schemes. This paper proposes an automatic, probabilistic approach to the detection of type conflicts and object cluster similarities in database schemes. The method considers pairs of objects having different types (resp., pairs of clusters) that belong to different schemes and measures their similarity. To this end, object (resp., cluster) structures as well as object (resp., cluster) neighborhoods are analyzed to identify similarities and differences. Several examples show that the techniques are suitable for detecting type conflicts and object cluster similarities effectively.
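
The pairwise comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's probabilistic method: the Jaccard overlap, the equal weighting of structure and neighborhood, the 0.5 threshold, and all names (`SchemeObject`, `similarity`, `candidate_type_conflicts`) are assumptions introduced here for illustration only. The sketch also assumes that names in the two schemes have already been normalized, so that equal strings denote the same concept.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class SchemeObject:
    """Hypothetical representation of one scheme object."""
    name: str
    kind: str  # "entity", "relationship", or "attribute"
    structure: frozenset = frozenset()     # e.g. names of the object's attributes
    neighborhood: frozenset = frozenset()  # names of directly connected objects


def jaccard(a, b):
    """Set overlap in [0, 1]; two empty sets are treated as dissimilar."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(x, y, w_structure=0.5):
    """Weighted mix of structural and neighborhood overlap (assumed measure)."""
    return (w_structure * jaccard(x.structure, y.structure)
            + (1 - w_structure) * jaccard(x.neighborhood, y.neighborhood))


def candidate_type_conflicts(scheme_a, scheme_b, threshold=0.5):
    """Return (name_a, name_b, score) for cross-scheme pairs of different type."""
    hits = []
    for x, y in product(scheme_a, scheme_b):
        if x.kind != y.kind:  # only objects with different types can conflict
            score = similarity(x, y)
            if score >= threshold:
                hits.append((x.name, y.name, score))
    return sorted(hits, key=lambda h: -h[2])


# Illustrative data: a marriage modeled as an entity in one scheme
# and as a relationship in the other.
scheme_a = [
    SchemeObject("Person", "entity",
                 frozenset({"name", "ssn"}), frozenset({"Marriage"})),
    SchemeObject("Marriage", "entity",
                 frozenset({"date", "place"}), frozenset({"Person"})),
]
scheme_b = [
    SchemeObject("Person", "entity",
                 frozenset({"name", "ssn"}), frozenset({"Married_to"})),
    SchemeObject("Married_to", "relationship",
                 frozenset({"date", "place"}), frozenset({"Person"})),
]

conflicts = candidate_type_conflicts(scheme_a, scheme_b)
print(conflicts)
```

In this toy input the only pair reported is the entity `Marriage` against the relationship `Married_to`, since the two share both their attributes and their neighbor. The same comparison could in principle be applied to clusters by replacing single objects with sets of objects, but that extension is likewise an assumption rather than something the abstract specifies.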
