Exploring biomedical ontology mappings with graph theory methods

Background: In the era of the Semantic Web, life science ontologies play an important role in tasks such as annotating biological objects, linking relevant data, and verifying data consistency. Understanding ontology structures and the overlap between ontologies is essential for tasks such as ontology reuse and development. We present an exploratory study in which we examine the structure of, and look for patterns in, BioPortal, a comprehensive, publicly available repository of life science ontologies.

Methods: We report an analysis of biomedical ontology mapping data over time. We apply graph theory methods such as modularity analysis and betweenness centrality to analyse data gathered at five different time points. We identify communities, i.e., sets of overlapping ontologies, and define similar and closest communities. We demonstrate the evolution of the identified communities over time and identify the core ontologies of the closest communities. We use BioPortal project and category data to measure community coherence. We also validate the identified communities against their mutual mentions in the scientific literature.

Results: By comparing mapping data gathered at five different time points, we identified similar and closest communities of overlapping ontologies and demonstrated the evolution of communities over time. The results showed that anatomy and health ontologies tend to form more isolated communities than ontologies in other categories. We also showed that communities contain all or the majority of the ontologies used in narrower projects. In addition, we identified major changes in mapping data after the migration to BioPortal Version 4.
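To make two of the graph measures named in the Methods more concrete, the following is a minimal Python sketch, not the authors' implementation. It computes unnormalized betweenness centrality (using Brandes' algorithm) on a toy ontology mapping graph, and compares communities from two time points with the Dice coefficient. The abstract does not state how "similar" and "closest" communities are defined, so the use of Dice overlap here is an assumption, and the ontology labels `A` to `F`, the toy graph, and the function names are all hypothetical. Community detection by modularity is not shown.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness centrality of an undirected, unweighted graph.

    adj maps each node to the set of its neighbours (Brandes' algorithm).
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack = []                       # nodes in order of non-decreasing distance from s
        preds = {v: [] for v in adj}     # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                     # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:          # w seen for the first time
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                     # accumulate dependencies, farthest nodes first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # In an undirected graph every pair is counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}


def dice(x, y):
    """Dice coefficient 2|X ∩ Y| / (|X| + |Y|) of two collections treated as sets."""
    x, y = set(x), set(y)
    if not x and not y:
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))


def closest_community(community, candidates):
    """Return the candidate with the highest Dice overlap (assumed notion of 'closest')."""
    return max(candidates, key=lambda c: dice(community, c))


# Hypothetical mapping graph: an edge means two ontologies share mappings.
# Two triangles (A, B, C) and (D, E, F) are joined by the edge C-D.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("D", "F"), ("E", "F")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

centrality = betweenness(adj)

# Hypothetical communities at an earlier and a later time point.
earlier = frozenset({"A", "B", "C"})
later = [frozenset({"A", "B", "D"}), frozenset({"E", "F"})]
match = closest_community(earlier, later)
```

In this toy graph the two bridging ontologies `C` and `D` receive the highest betweenness, which is the sense in which such a measure can point to candidate core ontologies. The community from the earlier time point is matched to the later community with which it shares the most members.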
