Ontology-Based Querying with Bio2RDF’s Linked Open Data

Background

A key activity for life scientists in this post “-omics” age involves searching for and integrating biological data from a multitude of independent databases. However, our ability to find relevant data is hampered by non-standard web and database interfaces backed by an enormous variety of data formats. This heterogeneity presents an overwhelming barrier to the discovery and reuse of resources which have been developed at great public expense.

To address this issue, the open-source Bio2RDF project promotes a simple convention to integrate diverse biological data using Semantic Web technologies. However, querying Bio2RDF remains difficult due to the lack of uniformity in the representation of Bio2RDF datasets.

Results

We describe an update to Bio2RDF that includes tighter integration across 19 new and updated RDF datasets. All available open-source scripts were first consolidated into a single GitHub repository and then redeveloped using a common API that generates normalized IRIs using a centralized dataset registry. We then mapped dataset-specific types and relations to the Semanticscience Integrated Ontology (SIO) and demonstrate simplified federated queries across multiple Bio2RDF endpoints.

Conclusions

This coordinated release marks an important milestone for the Bio2RDF open-source linked data framework. Principally, it improves the quality of linked data in the Bio2RDF network and makes it easier to access or recreate the linked data locally. We hope to continue improving the Bio2RDF network of linked data by identifying priority databases and extending vocabulary coverage to additional dataset vocabularies beyond SIO.
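
The Results mention a common API that generates normalized IRIs from a centralized dataset registry. The sketch below is one way such a step might look, not the project's actual implementation. It assumes the Bio2RDF IRI pattern `http://bio2rdf.org/namespace:identifier`; the registry contents and the names `REGISTRY` and `normalize_iri` are hypothetical and chosen only for illustration.

```python
# Hypothetical registry: maps prefix variants seen in source data to one
# preferred namespace. The real Bio2RDF registry is far larger; these
# entries are illustrative assumptions only.
REGISTRY = {
    "uniprot": "uniprot",
    "uniprotkb": "uniprot",
    "swissprot": "uniprot",
    "ncbigene": "ncbigene",
    "geneid": "ncbigene",
    "entrezgene": "ncbigene",
}

# Assumed base for Bio2RDF-style IRIs.
BASE = "http://bio2rdf.org/"


def normalize_iri(prefix: str, identifier: str) -> str:
    """Return a normalized IRI for a (prefix, identifier) pair.

    The prefix is matched case-insensitively against the registry and
    replaced by its preferred namespace; the identifier is kept as given
    apart from surrounding whitespace.
    """
    key = prefix.strip().lower()
    if key not in REGISTRY:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return f"{BASE}{REGISTRY[key]}:{identifier.strip()}"
```

Routing every script through one such function means two datasets that refer to the same record under different prefixes end up with the same IRI, which is what allows links between datasets to resolve.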
