Dictionary of disease ontologies (DODO): a graph database to facilitate access and interaction with disease and phenotype ontologies

The formal, hierarchical classification of diseases and phenotypes in ontologies facilitates their connection to various biomedical databases (drugs, drug targets, genetic variants, literature, etc.). Connecting these resources is complicated by heterogeneous disease definitions and by differences in granularity and structure across ontologies. Despite ongoing integration efforts, two challenges remain: (1) no resource provides a complete mapping across the multitude of disease ontologies, and (2) no software is available to comprehensively explore and interact with disease ontologies. This paper presents the DODO (Dictionary of Disease Ontologies) database and R package. DODO addresses these two challenges by constructing a meta-database that incorporates information from different publicly available disease ontologies. Thanks to its graph implementation, DODO can identify indirect cross-references by treating some relationships as transitive. The R package provides functions to build and interact with disease networks and to convert identifiers between ontologies. These functions specifically aim to facilitate the integration of information from life science databases without the need to harmonize them upfront. A workflow for local adaptation and extension of the DODO database and a Docker image with a DODO database instance are available.
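
The idea of finding indirect cross-references through transitive relationships can be illustrated with a small sketch. The code below is not the DODO R package or its database schema. It is a minimal Python illustration under stated assumptions: the identifiers (ONT_A:001 and so on), the edge list, and the function names build_graph and convert are all hypothetical, and the rule that a non-transitive edge is followed only as a single direct hop from the starting identifier is a simplification chosen for this example.

```python
from collections import deque

# Hypothetical cross-reference edges: (identifier, identifier, is_transitive).
# The ontology prefixes and identifiers are illustrative, not real records.
XREFS = [
    ("ONT_A:001", "ONT_B:101", True),
    ("ONT_B:101", "ONT_C:201", True),
    ("ONT_C:201", "ONT_D:301", False),
]


def build_graph(edges):
    """Build an undirected adjacency list from cross-reference edges."""
    graph = {}
    for a, b, transitive in edges:
        graph.setdefault(a, []).append((b, transitive))
        graph.setdefault(b, []).append((a, transitive))
    return graph


def convert(graph, start, target_prefix):
    """Return identifiers of the target ontology that map to `start`.

    Assumption for this sketch: transitive edges may be chained to any
    depth, while a non-transitive edge counts only as one direct hop
    from `start` and is never expanded further.
    """
    chained = {start}   # nodes reached through transitive edges only
    direct = set()      # nodes one non-transitive hop away from start
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor, transitive in graph.get(node, []):
            if transitive:
                if neighbor not in chained:
                    chained.add(neighbor)
                    queue.append(neighbor)
            elif node == start:
                direct.add(neighbor)
    hits = (chained | direct) - {start}
    return {n for n in hits if n.split(":")[0] == target_prefix}


if __name__ == "__main__":
    graph = build_graph(XREFS)
    # Indirect mapping found by chaining two transitive cross-references.
    print(convert(graph, "ONT_A:001", "ONT_C"))
    # Not reachable: the last edge is non-transitive and not adjacent to the start.
    print(convert(graph, "ONT_A:001", "ONT_D"))
```

In this toy graph, ONT_A:001 maps to ONT_C:201 even though no direct cross-reference between the two exists, while the non-transitive edge keeps the mapping from spreading to ONT_D:301.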
