DTiGEMS+: drug–target interaction prediction using graph embedding, graph mining, and similarity-based techniques

In silico prediction of drug–target interactions is a critical phase in the sustainable drug development process, especially when the research focus is on repositioning existing drugs. Developing such computational methods is difficult but much needed, because current methods for predicting potential drug–target interactions suffer from high false-positive rates. Here we introduce DTiGEMS+, a computational method that predicts Drug–Target interactions using Graph Embedding, graph Mining, and Similarity-based techniques. DTiGEMS+ combines similarity-based and feature-based approaches, and models the identification of novel drug–target interactions as a link prediction problem in a heterogeneous network. DTiGEMS+ constructs the heterogeneous network by augmenting the graph of known drug–target interactions with two complementary graphs: a drug–drug similarity graph and a target–target similarity graph. To produce the final drug–target predictions, DTiGEMS+ combines several computational techniques: graph embedding, graph mining, and machine learning. It integrates multiple drug–drug and target–target similarities into the final heterogeneous graph after applying a similarity selection procedure and a similarity fusion algorithm. Using four benchmark datasets, we show that DTiGEMS+ substantially improves prediction performance compared with other state-of-the-art in silico methods for predicting drug–target interactions: it achieves the highest average AUPR across all datasets (0.92), which reduces the error rate by 33.3% relative to the second-best performing model in the comparison.
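The heterogeneous-network formulation described above can be illustrated with a minimal Python sketch. It is not the DTiGEMS+ implementation: the toy drugs, targets, similarity values, the 0.5 edge threshold, and the function names `average_fusion`, `build_hetero_graph`, and `path_score` are all hypothetical. Element-wise averaging stands in for the similarity fusion algorithm, and a two-step path score stands in for the graph mining features.

```python
def average_fusion(matrices):
    """Element-wise mean of equally shaped similarity matrices.

    Assumption: a plain average is used here only as a stand-in for a
    real similarity fusion algorithm.
    """
    n = len(matrices)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[i][j] for m in matrices) / n for j in range(cols)]
            for i in range(rows)]


def build_hetero_graph(drugs, targets, interactions,
                       drug_sim, target_sim, threshold=0.5):
    """Return a weighted adjacency dict over drug and target nodes."""
    graph = {node: {} for node in drugs + targets}
    # Known drug-target interactions become unit-weight edges.
    for d, t in interactions:
        graph[d][t] = 1.0
        graph[t][d] = 1.0
    # Similarity edges are kept only at or above the (hypothetical) threshold.
    for names, sim in ((drugs, drug_sim), (targets, target_sim)):
        for i, a in enumerate(names):
            for j, b in enumerate(names):
                if i != j and sim[i][j] >= threshold:
                    graph[a][b] = sim[i][j]
    return graph


def path_score(graph, drug, target):
    """Sum of edge-weight products over all two-step paths drug -> x -> target."""
    return sum(w * graph[mid].get(target, 0.0)
               for mid, w in graph[drug].items())


# Toy data (entirely made up for illustration).
drugs = ["d1", "d2", "d3"]
targets = ["t1", "t2"]
interactions = [("d1", "t1"), ("d2", "t2")]

drug_sim_a = [[1.0, 0.8, 0.2], [0.8, 1.0, 0.4], [0.2, 0.4, 1.0]]
drug_sim_b = [[1.0, 0.6, 0.4], [0.6, 1.0, 0.8], [0.4, 0.8, 1.0]]
target_sim = [[1.0, 0.9], [0.9, 1.0]]

fused_drug_sim = average_fusion([drug_sim_a, drug_sim_b])
graph = build_hetero_graph(drugs, targets, interactions,
                           fused_drug_sim, target_sim)

# Rank the drug-target pairs that are not yet known interactions.
known = set(interactions)
candidates = [(d, t) for d in drugs for t in targets if (d, t) not in known]
ranking = sorted(candidates, key=lambda p: path_score(graph, *p), reverse=True)

for d, t in ranking:
    print(d, t, round(path_score(graph, d, t), 3))
```

In this sketch a candidate pair scores highly when the drug is similar to a drug that already binds the target, or when the drug binds a target similar to the candidate target. In a fuller pipeline such scores would be only one group of features, alongside graph embeddings, passed to a supervised classifier.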
