Matching object catalogues

A catalogue holds information about a set of objects, typically classified using terms from a given thesaurus and described by a set of attributes. Matching a pair of catalogues means finding a relationship between the terms of their thesauri and a relationship between their attributes. This paper first introduces a matching approach, based on the notion of similarity, that applies to both thesaurus matching and attribute matching. It then describes matchings based on mutual information and introduces variations that explore certain heuristics. Finally, it discusses experimental results that evaluate the precision of the matchings and measure the influence of the heuristics.
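
As an illustration of what a mutual-information-based thesaurus matching might look like, the following minimal sketch scores pairs of terms by their contribution to the mutual information between two classifications. It is not the paper's algorithm. It assumes that each catalogue maps an object identifier to a single term and that objects appearing in both catalogues can be recognised by a shared identifier. The function name `mutual_information_matching` and the sample data are hypothetical.

```python
import math
from collections import Counter


def mutual_information_matching(cat_a, cat_b):
    """Match each term of catalogue A to a term of catalogue B.

    cat_a, cat_b: dicts mapping an object identifier to the term that
    classifies the object (an assumed, simplified catalogue model).
    Returns a dict mapping terms of A to their best-scoring term of B.
    """
    # Only objects present in both catalogues provide evidence.
    shared = set(cat_a) & set(cat_b)
    n = len(shared)
    if n == 0:
        return {}

    count_a = Counter(cat_a[o] for o in shared)
    count_b = Counter(cat_b[o] for o in shared)
    count_ab = Counter((cat_a[o], cat_b[o]) for o in shared)

    # Score of a term pair: its term in the mutual information sum,
    # p(a, b) * log2(p(a, b) / (p(a) * p(b))).
    scores = {}
    for (a, b), c in count_ab.items():
        p_ab = c / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        scores[(a, b)] = p_ab * math.log2(p_ab / (p_a * p_b))

    # Keep, for each term of A, the positively scored term of B
    # with the highest score.
    matching = {}
    for (a, b), s in scores.items():
        if s > 0 and (a not in matching or s > scores[(a, matching[a])]):
            matching[a] = b
    return matching


# Hypothetical sample catalogues; object 6 occurs only in catalogue B.
catalogue_a = {1: "river", 2: "river", 3: "lake", 4: "lake", 5: "river"}
catalogue_b = {1: "stream", 2: "stream", 3: "pond", 4: "pond", 5: "pond", 6: "stream"}
print(mutual_information_matching(catalogue_a, catalogue_b))
```

Pairs that co-occur less often than independence would predict receive a negative score and are discarded, so a term of A may remain unmatched. The same scoring could in principle be applied to attribute values instead of terms, again under the assumption that shared objects can be identified.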
