A Literature-Based Knowledge Graph Embedding Method for Identifying Drug Repurposing Opportunities in Rare Diseases

One in ten people is affected by a rare disease, and three out of ten children with rare diseases will not live past age five. However, the small market size of individual rare diseases, combined with the time and capital requirements of pharmaceutical R&D, has hindered the development of new drugs for these conditions. A promising alternative is drug repurposing, whereby existing FDA-approved drugs are used to treat diseases other than their original indications. To generate drug repurposing hypotheses systematically and comprehensively, it is essential to integrate information from across the literature of pharmacology, genetics, and pathology. To this end, we leverage a newly developed knowledge graph, the Global Network of Biomedical Relationships (GNBR). GNBR is a large, heterogeneous knowledge graph comprising drug, disease, and gene (or protein) entities linked by a small set of semantic themes derived from the abstracts of biomedical literature. We apply a knowledge graph embedding method that explicitly models the uncertainty associated with literature-derived relationships and uses link prediction to generate drug repurposing hypotheses. This approach achieves high performance on a gold-standard test set of known drug indications (AUROC = 0.89) and is capable of generating novel repurposing hypotheses, which we independently validate using external literature sources and protein interaction networks. Finally, we demonstrate the ability of our model to produce explanations of its predictions.
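
To make the link-prediction framing concrete, the following is a minimal sketch of how candidate drug-disease pairs might be scored with embeddings and evaluated by AUROC. It is not the method of this paper: a TransE-style translational score is swapped in purely for illustration, and it does not model the uncertainty of literature-derived relationships. All entity names, embedding values, and the `treats` relation vector are hypothetical.

```python
import math

def transe_score(head, relation, tail):
    """TransE-style plausibility: negative Euclidean distance between
    (head + relation) and tail. Higher means more plausible."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def auroc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative
    (ties count as half a win)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical 2-d embeddings; a real model would learn these from the graph.
drug = {"drug_A": [0.0, 0.0], "drug_B": [1.0, 1.0]}
disease = {"disease_X": [1.0, 0.0], "disease_Y": [2.0, 1.0]}
treats = [1.0, 0.0]  # hypothetical relation vector for a "treats" theme

# Toy gold standard: known indications versus pairs assumed to be negatives.
known = [("drug_A", "disease_X"), ("drug_B", "disease_Y")]
assumed_negative = [("drug_A", "disease_Y"), ("drug_B", "disease_X")]

pos = [transe_score(drug[d], treats, disease[s]) for d, s in known]
neg = [transe_score(drug[d], treats, disease[s]) for d, s in assumed_negative]
toy_auroc = auroc(pos, neg)

# Rank every disease for one drug; top-ranked unknown pairs would be the
# repurposing hypotheses in this toy setting.
ranking = sorted(disease, key=lambda s: transe_score(drug["drug_A"], treats, disease[s]), reverse=True)
```

The toy embeddings are constructed so that the known pairs score highest, so `toy_auroc` says nothing about the AUROC of 0.89 reported above; the sketch only shows how scoring, ranking, and evaluation fit together.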
