DNN-DTIs: improved drug-target interactions prediction using XGBoost feature selection and deep neural network

Research, analysis, and prediction of drug-target interactions (DTIs) play an important role in understanding drug mechanisms and in drug repositioning and design. Machine learning (ML)-based methods for DTI prediction can mitigate the shortcomings of time-consuming and labor-intensive experimental approaches, providing new ideas and insights for drug design. We propose a novel pipeline for predicting drug-target interactions, called DNN-DTIs. First, the target information is characterized by pseudo-amino acid composition, pseudo position-specific scoring matrix, conjoint triad, composition, transition and distribution, Moreau-Broto autocorrelation, and structural features. Then, the drug compounds are encoded using substructure fingerprints. Next, we use XGBoost to select a nonredundant and important feature subset, and apply SMOTE to obtain optimized and balanced sample vectors. Finally, a DTI predictor, DNN-DTIs, is developed based on a deep neural network (DNN) via layer-by-layer learning. Experimental results indicate that DNN-DTIs outperforms other predictors, with ACC values of 98.78%, 98.60%, 97.98%, 98.24% and 98.00% on the Enzyme, Ion Channels (IC), GPCR, Nuclear Receptors (NR) and Kuang's datasets, respectively. Therefore, the accurate prediction performance of DNN-DTIs on Network1 and Network2 makes it a logical choice for contributing to the study of DTIs, especially drug repositioning and finding new uses for old drugs.
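The balancing step mentioned above can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic sample is interpolated between a minority-class sample and one of its nearest minority-class neighbours. This is not the authors' implementation. The function name `smote`, the parameter values, the toy feature vectors, and the assumption that interacting pairs form the minority class are all hypothetical and chosen only for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic new points interpolated between minority samples.

    Each synthetic point lies on the segment joining a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority samples, nearest first.
        others = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up feature vectors (assumption: interacting pairs are the minority).
negatives = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4], [0.4, 0.1]]
positives = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3]]

# Generate just enough synthetic positives to match the number of negatives.
new_positives = smote(positives, len(negatives) - len(positives), k=2)
balanced_positives = positives + new_positives
print(len(negatives), len(balanced_positives))
```

In a pipeline of this kind, oversampling would normally be applied to the training split only, so that synthetic samples do not leak into the evaluation data.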
