Matching Hierarchies Using Shared Objects

One of the main challenges in integrating two hierarchies (e.g., of books or web pages) is determining the correspondence between the edges of each hierarchy. Traditionally, this process, which we call hierarchy matching, is done by comparing the text associated with each edge. In this paper we instead use the placement of objects present in both hierarchies to infer how the hierarchies relate. We present two algorithms that, given a hierarchy with known facets (attribute-value pairs that define which objects are placed under an edge), determine feasible facets for a second hierarchy based on the objects the two hierarchies share. One algorithm is rule-based and the other is statistics-based. In the experimental section, we compare the results of the two algorithms and examine how their performance varies with the amount of noise in the hierarchies.
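As a rough illustration of the idea of inferring facets from shared objects, the sketch below assigns to each edge of the second hierarchy the attribute-value pairs held by enough of the shared objects placed under it. This is not either of the paper's algorithms; the function name `feasible_facets`, the `min_support` threshold, and the toy data are all assumptions made for illustration. A threshold of 1.0 behaves like a strict rule (every shared object must carry the facet), while a lower threshold tolerates some noise.

```python
from typing import Dict, Set, Tuple

Facet = Tuple[str, str]  # (attribute, value), e.g. ("genre", "fiction")


def feasible_facets(
    known_facets: Dict[str, Set[Facet]],
    target_placement: Dict[str, Set[str]],
    min_support: float = 1.0,
) -> Dict[str, Set[Facet]]:
    """Hypothetical helper: guess facets for edges of a second hierarchy.

    known_facets maps each object to the facets implied by its placement
    in the first hierarchy. target_placement maps each edge of the second
    hierarchy to the objects placed under it. A facet is kept for an edge
    when at least min_support of the edge's shared objects carry it.
    """
    result: Dict[str, Set[Facet]] = {}
    for edge, objects in target_placement.items():
        # Only objects present in both hierarchies carry evidence.
        shared = [obj for obj in objects if obj in known_facets]
        counts: Dict[Facet, int] = {}
        for obj in shared:
            for facet in known_facets[obj]:
                counts[facet] = counts.get(facet, 0) + 1
        result[edge] = {
            facet
            for facet, count in counts.items()
            if count / len(shared) >= min_support
        }
    return result


# Toy data (assumed): three books with facets known from the first hierarchy.
books = {
    "b1": {("genre", "fiction"), ("lang", "en")},
    "b2": {("genre", "fiction"), ("lang", "fr")},
    "b3": {("genre", "history"), ("lang", "en")},
}
# Second hierarchy: "x9" is not shared, so it contributes no evidence.
target = {"Novels": {"b1", "b2", "x9"}, "English": {"b1", "b3"}}

strict = feasible_facets(books, target)
# One misplaced book (b3) under "Novels" defeats the strict rule
# but not a 0.6 support threshold.
noisy_target = {"Novels": {"b1", "b2", "b3"}}
noisy_strict = feasible_facets(books, noisy_target)
noisy_relaxed = feasible_facets(books, noisy_target, min_support=0.6)
```

In the noisy case the relaxed threshold keeps the genre facet but also admits a spurious language facet, which hints at why behavior under noise is worth measuring.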
