Drug-Target Interaction Prediction via an Ensemble of Weighted Nearest Neighbors with Interaction Recovery

Bin Liu, School of Informatics, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece. E-mail: binliu@csd.auth.gr
Konstantinos Pliakos, KU Leuven, Campus KULAK, Faculty of Medicine, Kortrijk, Belgium; ITEC, imec research group at KU Leuven. E-mail: konstantinos.pliakos@kuleuven.be
Celine Vens, KU Leuven, Campus KULAK, Faculty of Medicine, Kortrijk, Belgium; ITEC, imec research group at KU Leuven. E-mail: celine.vens@kuleuven.be
Grigorios Tsoumakas, School of Informatics, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece. E-mail: greg@csd.auth.gr

arXiv:2012.12325v1 [cs.LG] 22 Dec 2020

Predicting drug-target interactions (DTI) with reliable computational methods is an effective and efficient way to reduce the enormous cost and time of the drug discovery process. Structure-based drug similarities and sequence-based target protein similarities are the information commonly used for DTI prediction. Among the numerous computational methods, neighborhood-based chemogenomic approaches, which leverage drug and target similarities to perform predictions directly, are simple yet promising. However, most existing similarity-based methods follow the transductive setting: they cannot directly generalize to unseen data, because they must be rebuilt to predict interactions for newly arriving drugs, targets, or drug-target pairs. In addition, many similarity-based methods, especially neighborhood-based ones, cannot directly handle all three types of interaction prediction involving new drugs and/or new targets. Furthermore, the large number of missing (undetected) interactions in current DTI datasets hinders most DTI prediction methods. To address these issues, we propose a new method called Weighted k Nearest Neighbor with Interaction Recovery (WkNNIR). Not only can WkNNIR estimate interactions of any new drugs and/or new targets
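
To make the neighborhood-based idea concrete, the sketch below scores all targets for one unseen drug by averaging the interaction profiles of its most similar training drugs. It is a generic weighted nearest-neighbor illustration, not the WkNNIR algorithm itself: the function name `wknn_scores`, the rank-decay parameter `eta`, and the toy data are assumptions introduced here, and the interaction recovery step is not modeled.

```python
import numpy as np


def wknn_scores(sim_to_train, Y_train, k=3, eta=0.8):
    """Score every target for one unseen drug from its k most similar training drugs.

    sim_to_train: similarities between the new drug and each training drug, shape (n,).
    Y_train: binary drug-target interaction matrix of the training drugs, shape (n, m).
    eta: hypothetical decay factor in (0, 1]; lower values favor the closest neighbors.
    """
    sim_to_train = np.asarray(sim_to_train, dtype=float)
    Y_train = np.asarray(Y_train, dtype=float)

    # Indices of the k most similar training drugs, most similar first.
    order = np.argsort(-sim_to_train, kind="stable")[:k]

    # Weight of the i-th neighbor (0-based rank): eta**i times its similarity.
    weights = (eta ** np.arange(len(order))) * sim_to_train[order]

    total = weights.sum()
    if total == 0:
        # No informative neighbor: return zero scores for all targets.
        return np.zeros(Y_train.shape[1])

    # Weighted average of the neighbors' interaction profiles.
    return weights @ Y_train[order] / total


# Toy example: three training drugs, three targets.
Y = np.array([[1, 0, 0],
              [0, 1, 0],
              [1, 1, 0]])
sim = np.array([0.9, 0.1, 0.5])
scores = wknn_scores(sim, Y, k=2, eta=0.5)
print(scores)
```

Because the scores depend only on similarities to the stored training drugs, a new drug can be scored without refitting anything. The same function applies to a new target by passing target similarities and the transposed interaction matrix.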
