Comparison of Schema Matching Evaluations

Schema matching has recently attracted considerable interest in both research and practice. Determining the matching components of database or XML schemas is needed in many applications, e.g., in e-business and data integration. Various schema matching systems have been developed to solve the problem semi-automatically. While there have been some evaluations, the overall effectiveness of currently available automatic schema matching systems remains largely unclear. This is because the evaluations were conducted in diverse ways, making it difficult to assess the effectiveness of any single system, let alone to compare systems with one another. In this paper, we survey recently published schema matching evaluations. For this purpose, we introduce the major criteria that influence the effectiveness of a schema matching approach and use these criteria to compare the various systems. Based on our observations, we discuss the requirements for future match implementations and evaluations.
