Drug-target interaction prediction using ensemble learning and dimensionality reduction.

Experimental prediction of drug-target interactions is expensive, time-consuming and tedious. Fortunately, computational methods help narrow down the search space for interaction candidates to be further examined via wet-lab techniques. Nowadays, the number of attributes/features for drugs and targets, as well as the amount of their interactions, are increasing, making these computational methods inefficient or occasionally prohibitive. This motivates us to derive a reduced feature set for prediction. In addition, since ensemble learning techniques are widely used to improve the classification performance, it is also worthwhile to design an ensemble learning framework to enhance the performance for drug-target interaction prediction. In this paper, we propose a framework for drug-target interaction prediction leveraging both feature dimensionality reduction and ensemble learning. First, we conducted feature subspacing to inject diversity into the classifier ensemble. Second, we applied three different dimensionality reduction methods to the subspaced features. Third, we trained homogeneous base learners with the reduced features and then aggregated their scores to derive the final predictions. For base learners, we selected two classifiers, namely Decision Tree and Kernel Ridge Regression, resulting in two variants of ensemble models, EnsemDT and EnsemKRR, respectively. In our experiments, we utilized AUC (Area under ROC Curve) as an evaluation metric. We compared our proposed methods with various state-of-the-art methods under 5-fold cross validation. Experimental results showed EnsemKRR achieving the highest AUC (94.3%) for predicting drug-target interactions. In addition, dimensionality reduction helped improve the performance of EnsemDT. In conclusion, our proposed methods produced significant improvements for drug-target interaction prediction.

[1]  Louiqa Raschid,et al.  Ieee/acm Transactions on Computational Biology and Bioinformatics 1 Network-based Drug-target Interaction Prediction with Probabilistic Soft Logic , 2022 .

[2]  David Brown,et al.  Pharmacodynamic Modeling of Anti-Cancer Activity of Tetraiodothyroacetic Acid in a Perfused Cell Culture System , 2011, PLoS Comput. Biol..

[3]  Ferda Nur Alpaslan,et al.  Modeling and predicting binding affinity of phencyclidine‐like compounds using machine learning methods , 2009 .

[4]  Tom Fawcett,et al.  An introduction to ROC analysis , 2006, Pattern Recognit. Lett..

[5]  Stephen T. C. Wong,et al.  Toward better drug repositioning: prioritizing and integrating existing methods into efficient pipelines. , 2014, Drug discovery today.

[6]  Chee Keong Kwoh,et al.  Drug-target interaction prediction by learning from local information and neighbors , 2013, Bioinform..

[7]  S. D. Jong SIMPLS: an alternative approach to partial least squares regression , 1993 .

[8]  Philip E. Bourne,et al.  Drug Discovery Using Chemical Systems Biology: Weak Inhibition of Multiple Kinases May Contribute to the Anti-Cancer Effect of Nelfinavir , 2011, PLoS Comput. Biol..

[9]  Chee Keong Kwoh,et al.  Drug-Target Interaction Prediction with Graph Regularized Matrix Factorization , 2017, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[10]  Mikhail Belkin,et al.  Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering , 2001, NIPS.

[11]  David S. Wishart,et al.  DrugBank 3.0: a comprehensive resource for ‘Omics’ research on drugs , 2010, Nucleic Acids Res..

[12]  Ali Masoudi-Nejad,et al.  Drug–target interaction prediction via chemogenomic space: learning-based methods , 2014, Expert opinion on drug metabolism & toxicology.

[13]  Yasuo Tabei,et al.  Identification of chemogenomic features from drug–target interaction networks using interpretable classifiers , 2012, Bioinform..

[14]  Chee Keong Kwoh,et al.  Drug-target interaction prediction via class imbalance-aware ensemble learning , 2016, BMC Bioinformatics.

[15]  L. Jiang,et al.  PROFEAT: a web server for computing structural and physicochemical features of proteins and peptides from amino acid sequence , 2006, Nucleic Acids Res..

[16]  Zhi-Hua Zhou,et al.  Ensemble Methods: Foundations and Algorithms , 2012 .

[17]  K. Chou,et al.  Predicting Drug-Target Interaction Networks Based on Functional Groups and Biological Features , 2010, PloS one.

[18]  Nello Cristianini,et al.  An Introduction to Support Vector Machines and Other Kernel-based Learning Methods , 2000 .

[19]  Elena Marchiori,et al.  Gaussian interaction profile kernels for predicting drug-target interaction , 2011, Bioinform..

[20]  John P. Overington,et al.  ChEMBL: a large-scale bioactivity database for drug discovery , 2011, Nucleic Acids Res..

[21]  S. Kung Kernel Methods and Machine Learning , 2014 .

[22]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[23]  Z. R. Li,et al.  Update of PROFEAT: a web server for computing structural and physicochemical features of proteins and peptides from amino acid sequence , 2006, Nucleic Acids Res..

[24]  Yoshihiro Yamanishi,et al.  Prediction of drug–target interaction networks from the integration of chemical and genomic spaces , 2008, ISMB.

[25]  Qingsong Xu,et al.  Rcpi: R/Bioconductor package to generate various descriptors of proteins, compounds and their interactions , 2015, Bioinform..

[26]  Damian Szklarczyk,et al.  STITCH 4: integration of protein–chemical interactions with user data , 2013, Nucleic Acids Res..

[27]  G. G. Stokes "J." , 1890, The New Yale Book of Quotations.

[28]  Hao Ding,et al.  Collaborative matrix factorization with multiple similarities for predicting drug-target interactions , 2013, KDD.

[29]  Xu-Feng Huang,et al.  Metabolic Parameters and Emotionality Are Little Affected in G-Protein Coupled Receptor 12 (Gpr12) Mutant Mice , 2012, PloS one.

[30]  Xiaomin Luo,et al.  TarFisDock: a web server for identifying drug targets with docking approach , 2006, Nucleic Acids Res..

[31]  Xing Chen,et al.  Drug-target interaction prediction by random walk on the heterogeneous network. , 2012, Molecular bioSystems.

[32]  Hua Yu,et al.  A Systematic Prediction of Multiple Drug-Target Interactions from Chemical, Genomic, and Pharmacological Data , 2012, PloS one.

[33]  Yoshihiro Yamanishi,et al.  Supervised prediction of drug–target interactions using bipartite local models , 2009, Bioinform..

[34]  Zhiyong Lu,et al.  A survey of current trends in computational drug repositioning , 2016, Briefings Bioinform..

[35]  E. Marchiori,et al.  Predicting Drug-Target Interactions for New Drug Compounds Using a Weighted Nearest Neighbor Profile , 2013, PloS one.

[36]  Robert D. Finn,et al.  The Pfam protein families database: towards a more sustainable future , 2015, Nucleic Acids Res..

[37]  Chuang Liu,et al.  Prediction of Drug-Target Interactions and Drug Repositioning via Network-Based Inference , 2012, PLoS Comput. Biol..

[38]  Mehmet Gönen,et al.  Predicting drug-target interactions from chemical and genomic kernels using Bayesian matrix factorization , 2012, Bioinform..

[39]  Susumu Goto,et al.  KEGG for integration and interpretation of large-scale molecular data sets , 2011, Nucleic Acids Res..