Learning to map between ontologies on the semantic web

Ontologies play a prominent role on the Semantic Web. They make possible the widespread publication of machine-understandable data, opening myriad opportunities for automated information processing. However, because of the Semantic Web's distributed nature, data on it will inevitably come from many different ontologies. Information processing across ontologies is not possible without knowing the semantic mappings between their elements. Manually finding such mappings is tedious, error-prone, and infeasible at Web scale. Hence, the development of tools to assist in the ontology mapping process is crucial to the success of the Semantic Web.

We describe GLUE, a system that employs machine learning techniques to find such mappings. Given two ontologies, for each concept in one ontology GLUE finds the most similar concept in the other ontology. We give well-founded probabilistic definitions to several practical similarity measures, and show that GLUE can work with all of them. This is in contrast to most existing approaches, which deal with a single similarity measure. Another key feature of GLUE is that it uses multiple learning strategies, each of which exploits a different type of information, either in the data instances or in the taxonomic structure of the ontologies. To further improve matching accuracy, we extend GLUE to incorporate commonsense knowledge and domain constraints into the matching process. For this purpose, we show that relaxation labeling, a well-known constraint optimization technique used in computer vision and other fields, can be adapted to work efficiently in our context. Our approach is thus distinguished in that it works with a variety of well-defined similarity notions and efficiently incorporates multiple types of knowledge. We describe a set of experiments on several real-world domains, and show that GLUE proposes highly accurate semantic mappings.
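To make the idea of a probabilistic similarity measure concrete, the following is a minimal sketch and not the system's actual implementation. It assumes that, for each pair of concepts A and B, an estimate of the joint distribution over instance membership is already available (for example, from learned classifiers), and it uses the Jaccard coefficient P(A and B) / P(A or B) as one example of such a measure. The names `Joint`, `jaccard_similarity`, and `most_similar`, as well as the concept labels in the usage note, are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple

# Assumed representation of a joint distribution for a concept pair (A, B):
# (P(A, B), P(A, not B), P(not A, B), P(not A, not B))
Joint = Tuple[float, float, float, float]


def jaccard_similarity(joint: Joint) -> float:
    """Jaccard coefficient P(A and B) / P(A or B) from a joint distribution."""
    p_ab, p_a_notb, p_nota_b, _ = joint
    p_union = p_ab + p_a_notb + p_nota_b
    # If neither concept has any probability mass, treat similarity as 0.
    return p_ab / p_union if p_union > 0 else 0.0


def most_similar(joints: Dict[Tuple[str, str], Joint]) -> Dict[str, str]:
    """For each concept of the first ontology, pick the concept of the
    second ontology with the highest similarity score."""
    best: Dict[str, Tuple[str, float]] = {}
    for (a, b), joint in joints.items():
        score = jaccard_similarity(joint)
        if a not in best or score > best[a][1]:
            best[a] = (b, score)
    return {a: b for a, (b, _) in best.items()}
```

With toy inputs, `most_similar({("Professor", "Faculty"): (0.25, 0.125, 0.125, 0.5), ("Professor", "Course"): (0.0, 0.375, 0.25, 0.375)})` maps `"Professor"` to `"Faculty"`. Because the function only consumes a joint distribution, a different similarity measure defined on the same distribution could be swapped in without changing the matching step.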
