Bio2RDF Release 2: Improved Coverage, Interoperability and Provenance of Life Science Linked Data

Bio2RDF currently provides the largest network of Linked Data for the Life Sciences. Here, we describe a significant update that increases the overall quality of the RDFized datasets. These datasets are generated by open scripts built on an API, and the release provides registry-validated IRIs, dataset provenance and metrics, SPARQL endpoints, and downloadable RDF and database files. We demonstrate federated SPARQL queries within and across the Bio2RDF network, including semantic integration using the Semanticscience Integrated Ontology (SIO). This work forms a strong foundation for increased coverage and continuous integration of data in the life sciences.
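
To make the ideas of registry-validated IRIs and federated queries concrete, the following is a minimal Python sketch, not the Bio2RDF API itself. It assumes the `http://bio2rdf.org/namespace:identifier` IRI pattern; the `REGISTRY` table, its identifier patterns, the function names `make_iri` and `federated_query`, and the endpoint URL are all hypothetical and chosen only for illustration. The query uses the standard SPARQL 1.1 `SERVICE` keyword to reach a second endpoint.

```python
import re

# Hypothetical mini-registry: namespace prefix -> pattern the identifier must match.
# These patterns are illustrative assumptions, not the actual Bio2RDF registry.
REGISTRY = {
    "drugbank": re.compile(r"DB\d{5}"),
    "pubmed": re.compile(r"\d+"),
}

BASE = "http://bio2rdf.org/"  # assumed base for namespace:identifier IRIs


def make_iri(namespace: str, identifier: str) -> str:
    """Return an IRI only if the namespace is registered and the identifier is well-formed."""
    ns = namespace.lower()
    pattern = REGISTRY.get(ns)
    if pattern is None:
        raise KeyError(f"unregistered namespace: {namespace}")
    if not pattern.fullmatch(identifier):
        raise ValueError(f"identifier {identifier!r} is not valid for {ns}")
    return f"{BASE}{ns}:{identifier}"


def federated_query(subject_iri: str, remote_endpoint: str) -> str:
    """Build a SPARQL 1.1 query that fetches local triples and remote labels via SERVICE."""
    return (
        "SELECT ?p ?o ?label WHERE {\n"
        f"  <{subject_iri}> ?p ?o .\n"
        f"  SERVICE <{remote_endpoint}> {{\n"
        "    ?o <http://www.w3.org/2000/01/rdf-schema#label> ?label .\n"
        "  }\n"
        "}"
    )


if __name__ == "__main__":
    iri = make_iri("DrugBank", "DB00001")
    # The endpoint below is a placeholder, not a real Bio2RDF service address.
    print(federated_query(iri, "http://example.org/sparql"))
```

Validating identifiers before minting an IRI keeps malformed links out of the generated data, which is the point of checking against a registry; the query string would still need to be sent to a real SPARQL endpoint to return results.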
