Large-Scale Prediction of Drug-Target Interaction: a Data-Centric Review

The prediction of drug-target interactions (DTIs) is of extraordinary significance to modern drug discovery, both for suggesting new drug candidates and for repositioning old drugs. Despite technological advances, large-scale experimental determination of DTIs is still expensive and laborious, so effective and low-cost computational alternatives remain in strong demand. Meanwhile, open-access resources have been growing rapidly, with massive amounts of bioactivity data becoming available, creating unprecedented opportunities for the development of novel in silico models for large-scale DTI prediction. In this work, we review the state-of-the-art computational approaches for identifying DTIs from a data-centric perspective: what the underlying data are and how they are utilized in each study. We also summarize popular public data resources and online tools for DTI prediction. We find that various types of data have been employed, including properties of chemical structures, drug therapeutic effects and side effects, drug-target binding, drug-drug interactions, bioactivity data of drug molecules across multiple biological targets, and drug-induced gene expression. More often than not, heterogeneous data were integrated to offer better performance. However, challenges remain, such as handling data imbalance, incorporating negative samples and quantitative bioactivity data, and maintaining cross-links among different data sources, which are essential for large-scale and automated information integration.
