Inexact Matching of Ontology Graphs Using Expectation-Maximization

We present a new method for mapping ontology schemas that address similar domains. Ontology matching is a crucial problem because ontological data is increasingly developed and published in a decentralized manner. We formulate the problem of inferring a match between two ontologies as a maximum likelihood problem, and solve it using the technique of expectation-maximization (EM). Specifically, we adopt directed graphs as our model for ontology schemas and use a generalized version of EM to arrive at a mapping between the nodes of the graphs. We exploit the structural, lexical and instance similarity between the graphs, and differ from previous approaches in the way we utilize these similarities to arrive at a possibly inexact match. Inexact matching is the process of finding the best possible match between two graphs when exact matching is not possible or is computationally difficult. To scale the method to large ontologies, we identify the computational bottlenecks and adapt the generalized EM by using a memory-bounded partitioning scheme. We provide comparative experimental results in support of our method on two well-known ontology alignment benchmarks and discuss their implications.
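
As a rough illustration of the formulation summarized above, the matching problem can be written in generic EM notation. The symbols used here (O_s and O_t for the source and target graphs, M for the match, Z for the hidden node correspondences, Q for the expected log-likelihood) are assumptions chosen for this sketch and are not taken from the paper; the likelihood model itself is left unspecified.

```latex
% Maximum likelihood match between source graph O_s and target graph O_t
M^{*} = \arg\max_{M} \; P(O_s \mid O_t, M)

% E-step: expected complete-data log-likelihood under the current match M^{(n)},
% with the node correspondences Z treated as hidden variables
Q\bigl(M \mid M^{(n)}\bigr)
  = \mathbb{E}_{Z \sim P(Z \mid O_s, O_t, M^{(n)})}
    \bigl[ \log P(O_s, Z \mid O_t, M) \bigr]

% Generalized M-step: accept any match that does not decrease Q,
% rather than requiring the exact maximizer
Q\bigl(M^{(n+1)} \mid M^{(n)}\bigr) \;\ge\; Q\bigl(M^{(n)} \mid M^{(n)}\bigr)
```

The generalized variant only requires each M-step to improve Q, which is still sufficient for the likelihood not to decrease from one iteration to the next. This is useful when searching the full space of candidate matches for the exact maximizer is too expensive.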
