An efficient and scalable algorithm for segmented alignment of ontologies of arbitrary size

Achieving efficiency and scalability in aligning two massive, conceptually similar ontologies has been a formidable task. Here we assume that an ontology is given in RDF (Resource Description Framework) or OWL (Web Ontology Language) and can be represented by a directed graph. A straightforward approach to aligning two ontologies entails an O(N^2) computation, comparing every pair of nodes drawn from the two given ontologies, where N denotes the average number of nodes in each ontology. Our proposed algorithm, called the Anchor-Flood algorithm, requires O(N log(N)) computation on average. It starts off with an anchor, a pair of "look-alike" concepts, one from each ontology, and gradually explores concepts by collecting neighboring concepts, thereby taking advantage of locality of reference in the graph data structure. It outputs a set of alignments between concepts and properties within semantically connected subsets of the two entire graphs, which we call segments. Rather than comparing every pair of nodes across the two ontologies, we repeat the similarity comparison only between pairs of nodes within the neighborhoods surrounding the anchor, iterating until either all the collected concepts are explored or no new aligned pair is found. In this way, we can significantly reduce the computational time for the alignment. Moreover, since we focus only on segment-to-segment comparison, regardless of the entire size of the ontologies, our algorithm not only achieves high performance but also resolves the scalability problem in aligning ontologies. Our proposed algorithm also reduces the number of seemingly aligned but actually misaligned pairs. Through several examples with large ontologies, we demonstrate the features of our Anchor-Flood algorithm.
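
The anchor-and-neighborhood exploration described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the adjacency-dictionary representation, the function names `label_similarity` and `anchor_flood`, the character-based similarity measure, and the 0.8 threshold are all assumptions made for illustration, since the abstract does not specify them. The sketch only shows how comparisons stay confined to the segment around the anchor and how the loop ends once no new aligned pair is left to explore.

```python
from collections import deque
from difflib import SequenceMatcher


def label_similarity(a, b):
    """Hypothetical similarity measure: character-level ratio between two labels."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def anchor_flood(graph1, graph2, anchor, threshold=0.8):
    """Grow a set of aligned pairs outward from an anchor pair.

    graph1, graph2: dicts mapping a concept label to a list of neighbor labels
                    (an assumed, simplified stand-in for an RDF/OWL graph).
    anchor:         (concept_in_graph1, concept_in_graph2), taken as aligned.
    threshold:      assumed cut-off above which two labels count as aligned.
    """
    aligned = {anchor}
    queue = deque([anchor])
    # The loop ends when every collected pair has been explored, which also
    # means the last round of comparisons produced no new aligned pair.
    while queue:
        c1, c2 = queue.popleft()
        # Compare only the neighbors of an already aligned pair,
        # never the full cross product of both ontologies.
        for n1 in graph1.get(c1, []):
            for n2 in graph2.get(c2, []):
                pair = (n1, n2)
                if pair in aligned:
                    continue
                if label_similarity(n1, n2) >= threshold:
                    aligned.add(pair)
                    queue.append(pair)
    return aligned


# Two toy ontologies; "Vehicle"/"Car" is not connected to the anchor's segment.
graph1 = {"Animal": ["Mammal", "Bird"], "Mammal": ["Dog"], "Bird": [],
          "Dog": [], "Vehicle": ["Car"], "Car": []}
graph2 = {"Animal": ["Mammals", "Birds"], "Mammals": ["Dogs"], "Birds": [],
          "Dogs": [], "Vehicle": ["Cars"], "Cars": []}

segment = anchor_flood(graph1, graph2, ("Animal", "Animal"))
print(sorted(segment))
```

In this toy run the pair ("Car", "Cars") is never compared, because neither concept is reachable from the anchor; that is the sense in which the cost depends on the segment rather than on the full size of the ontologies.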
