Prediction of Drug–Target Interactions From Multi-Molecular Network Based on Deep Walk Embedding Model

Predicting drug–target interactions (DTIs) is crucial in innovative drug discovery, drug repositioning, and related fields. However, traditional biological experiments for identifying DTIs are costly, time-consuming, and inefficient, which makes them difficult to apply widely. As a complement, in silico methods can provide helpful information for DTI prediction in a timely manner. In this work, a deep walk embedding method is developed for predicting DTIs from a multi-molecular network. More specifically, a multi-molecular network, also called a molecular association network, is constructed by integrating the associations among drugs, proteins, diseases, lncRNAs, and miRNAs. Each node is then represented as a behavior feature vector by using a deep walk embedding method. Finally, we compared behavior features with traditional attribute features on an integrated dataset by using various classifiers. The experimental results revealed that behavior features performed better across different classifiers, especially the random forest classifier. They also demonstrated that behavior information is very helpful for addressing the problem of sequences containing both self-interacting and non-interacting pairs of proteins. This work is not only well suited to predicting DTIs, but also provides a new perspective for predicting associations among other biomolecules.
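The random-walk stage that a deep walk embedding relies on can be illustrated with a minimal sketch. This is not the authors' implementation: the toy edge list, the node names, and the parameters `num_walks` and `walk_length` are all hypothetical, and the sketch assumes an undirected, unweighted network. In a DeepWalk-style pipeline, the resulting walks would then be treated as sentences and passed to a skip-gram model to obtain one vector per node; that step is omitted here.

```python
import random
from collections import defaultdict

# Hypothetical toy associations; real networks would be built from curated databases.
EDGES = [
    ("drug:D1", "protein:P1"),
    ("drug:D1", "disease:S1"),
    ("drug:D2", "protein:P1"),
    ("protein:P1", "miRNA:M1"),
    ("miRNA:M1", "lncRNA:L1"),
    ("lncRNA:L1", "disease:S1"),
]


def build_adjacency(edges):
    """Build an undirected adjacency list from (node, node) pairs."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return dict(adj)


def random_walk(adj, start, walk_length, rng):
    """Truncated uniform random walk starting at `start`."""
    walk = [start]
    while len(walk) < walk_length:
        neighbors = adj.get(walk[-1], [])
        if not neighbors:  # dead end: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk


def generate_walks(adj, num_walks, walk_length, seed=0):
    """Start `num_walks` walks from every node, visiting nodes in shuffled order."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)
        for node in nodes:
            walks.append(random_walk(adj, node, walk_length, rng))
    return walks


adjacency = build_adjacency(EDGES)
walks = generate_walks(adjacency, num_walks=5, walk_length=8)
```

Because the walks pass through drugs, proteins, diseases, lncRNAs, and miRNAs alike, each node's context reflects its position in the whole network rather than its own attributes, which is the sense in which the resulting vector is a "behavior" feature.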
