Similarity-Based Methods and Machine Learning Approaches for Target Prediction in Early Drug Discovery: Performance and Scope

Computational methods for predicting the macromolecular targets of drugs and drug-like compounds have become a key technology in drug discovery. However, the established validation protocols leave several key questions regarding the performance and scope of these methods unaddressed. For example, prediction success rates are commonly reported as averages over all compounds of a test set and do not consider the structural relationship between the individual test compounds and the training instances. To obtain a better understanding of the value of ligand-based methods for target prediction, we benchmarked a similarity-based method and a random forest-based machine learning approach (both employing 2D molecular fingerprints) under three testing scenarios: a standard testing scenario with external data, a standard time-split scenario, and a scenario designed to most closely resemble real-world conditions. In addition, we deconvoluted the results based on the distances of the individual test molecules from the training data. We found that, surprisingly, the similarity-based approach generally outperformed the machine learning approach in all testing scenarios, even in cases where queries were structurally clearly distinct from the instances in the training (or reference) data. It did so despite covering a much larger part of the known target space.
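
The two ideas described above, ranking targets by fingerprint similarity and breaking down success rates by a query's distance from the reference data, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes fingerprints are given as sets of on-bit indices, scores each target by the maximum Tanimoto similarity to any of its reference ligands, and bins queries by their nearest-neighbor similarity. The function names, bin edges, and toy data are all hypothetical.

```python
from collections import defaultdict


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    inter = len(a & b)
    return inter / (len(a) + len(b) - inter)


def rank_targets(query_fp, reference):
    """Rank targets by the query's maximum similarity to any ligand of each target.

    reference: list of (fingerprint, target) pairs (hypothetical layout).
    Returns (target, score) pairs, best first; ties are broken by target name.
    """
    best = defaultdict(float)
    for fp, target in reference:
        best[target] = max(best[target], tanimoto(query_fp, fp))
    return sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))


def success_by_similarity_bin(queries, reference, edges=(0.0, 0.5, 1.0), top_k=1):
    """Top-k success rate, split by each query's nearest-neighbor similarity.

    queries: list of (fingerprint, true_target) pairs.
    edges: assumed bin boundaries; the last bin includes its upper edge.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for fp, true_target in queries:
        ranked = rank_targets(fp, reference)
        nearest = ranked[0][1]  # similarity to the closest reference ligand
        idx = sum(nearest >= e for e in edges[1:-1])
        label = f"{edges[idx]}-{edges[idx + 1]}"
        totals[label] += 1
        hits[label] += true_target in [t for t, _ in ranked[:top_k]]
    return {label: hits[label] / totals[label] for label in sorted(totals)}


# Toy data (made up for illustration only).
reference = [
    ({1, 2, 3, 4}, "A"),
    ({1, 2, 5, 6}, "A"),
    ({7, 8, 9, 10}, "B"),
    ({7, 8, 11}, "C"),
]
queries = [
    ({1, 2, 3, 4, 5}, "A"),       # close to the reference data
    ({7, 8, 9}, "B"),             # close to the reference data
    ({7, 12, 13, 14}, "C"),       # distant from the reference data
    ({7, 20, 21, 22, 23}, "A"),   # distant from the reference data
]

print(success_by_similarity_bin(queries, reference))
# {'0.0-0.5': 0.5, '0.5-1.0': 1.0}
```

Reporting one success rate per similarity bin, rather than a single average, shows how performance changes as queries move away from the reference compounds. That is the kind of breakdown the abstract argues standard validation protocols omit.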
