Interconnection of Biological Knowledge Using NikkajiRDF and Interlinking Ontology for Biological Concepts

We investigated how knowledge about biological molecules, biological phenomena, and diseases can be interconnected so that information on the functions, roles, and applications of chemical compounds and gene products, and on their involvement in diseases, can be collected efficiently. To this end, we used knowledge graphs (KGs) developed from Resource Description Framework (RDF) data and ontologies. The NikkajiRDF linked open data provide information on approximately 3.5 million chemical compounds and 694 application examples. We integrated NikkajiRDF with the Interlinking Ontology for Biological Concepts (IOBC), which contains approximately 80,000 concepts along with information on gene products, drugs, and diseases. Using IOBC’s ontological structure, we confirmed that this integration enabled us to infer new information regarding biological and chemical functions, applications, and involvement in diseases for 5038 chemical compounds. Furthermore, we developed KGs from IOBC and added to them the protein, biological phenomenon, and disease identifiers used in major biological databases: UniProt, Gene Ontology, and MeSH. Using the extended KGs and a federated search against DisGeNET, we discovered more than 60 chemicals and 700 gene products involved in 32 diseases.
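The inference step described above can be illustrated with a minimal sketch. It assumes that "using IOBC's ontological structure" means following subclass links upward from the roles asserted for a compound; the abstract does not spell out the mechanism, so this is only one plausible reading. All terms, compound names, and function names below are hypothetical toy data, not taken from NikkajiRDF or IOBC.

```python
# Toy subclass hierarchy (child -> parents). Hypothetical terms, not IOBC content.
SUBCLASS_OF = {
    "beta-lactam antibiotic": ["antibiotic"],
    "antibiotic": ["antibacterial agent"],
    "antibacterial agent": ["drug"],
}

# Toy direct annotations (compound -> asserted roles). Hypothetical data.
HAS_ROLE = {
    "compound:A": ["beta-lactam antibiotic"],
    "compound:B": ["antibacterial agent"],
}


def ancestors(term, hierarchy):
    """Return every superclass of `term`, following subclass links transitively."""
    seen = set()
    stack = list(hierarchy.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(hierarchy.get(parent, []))
    return seen


def infer_roles(has_role, hierarchy):
    """Map each compound to the roles implied by the hierarchy but not asserted."""
    inferred = {}
    for compound, roles in has_role.items():
        asserted = set(roles)
        implied = set()
        for role in asserted:
            implied |= ancestors(role, hierarchy)
        inferred[compound] = implied - asserted
    return inferred


if __name__ == "__main__":
    for compound, roles in sorted(infer_roles(HAS_ROLE, SUBCLASS_OF).items()):
        print(compound, sorted(roles))
```

In this toy setting, a compound annotated only with a narrow role gains the broader roles above it in the hierarchy, which is the kind of new information an ontology-backed integration can surface.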
