Evaluation of Similarity Measures and Heuristics for Simple RDF Schema Matching

Schema matching is a fundamental problem in database applications such as query mediation and data warehousing. In this paper, we assume that each database schema to be matched is described in RDF and contains only class definitions and property definitions whose ranges are XML Schema simple types. We propose and compare RDF property matching heuristics based on similarity functions applied to sets of observed values. We report experimental results showing that customized contrast models induce good-quality RDF property matchings.
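To make the idea of matching properties by the similarity of their observed values concrete, the following is a minimal sketch, not the heuristics evaluated in the paper. It assumes Tversky's contrast model with set cardinality as the salience function, S(A, B) = θ|A ∩ B| − α|A − B| − β|B − A|, and a simple best-candidate rule with a threshold. The function names, the weights (θ = 1, α = β = 0.5), the threshold, and the sample property names and values are all hypothetical.

```python
def contrast_similarity(a, b, theta=1.0, alpha=0.5, beta=0.5):
    """Tversky-style contrast score between two sets of observed values.

    Assumption: salience is plain set cardinality; the weights are illustrative.
    """
    a, b = set(a), set(b)
    return theta * len(a & b) - alpha * len(a - b) - beta * len(b - a)


def match_properties(source, target, threshold=0.0):
    """Map each source property to its best-scoring target property.

    `source` and `target` map property names to sets of observed values.
    A match is kept only if its score exceeds `threshold` (hypothetical rule).
    """
    matches = {}
    for prop, values in source.items():
        # Score every candidate target property against this source property.
        scores = {q: contrast_similarity(values, target[q]) for q in target}
        best = max(scores, key=scores.get)
        if scores[best] > threshold:
            matches[prop] = best
    return matches


# Hypothetical observed values for two small schemas.
source = {
    "ex:city": {"Rio", "Recife", "Natal"},
    "ex:zip": {"22451", "50670"},
}
target = {
    "geo:town": {"Rio", "Natal", "Belem"},
    "geo:postcode": {"22451", "50670", "69000"},
}

print(match_properties(source, target))
```

In this toy data, each source property shares two values with exactly one target property and none with the other, so the common-value term dominates the two difference terms only for the intended pair. Real instance data would call for tuning the weights and the threshold, which is the kind of customization the abstract refers to.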
