Similarity-based machine learning methods for predicting drug-target interactions: a brief review

Computational prediction of drug-target interactions helps select drug (or target) candidates for further biochemical verification. We focus on machine learning-based approaches, particularly similarity-based methods, which use drug similarities and target similarities to capture relationships among drugs and among targets, respectively. These two kinds of similarity correspond to two emerging concepts, the chemical space and the genomic space. Typically, the methods combine the two types of similarity to build models for predicting new drug-target interactions. This process is also closely related to much work in pharmacogenomics or chemical biology that attempts to understand the relationships between the chemical and genomic spaces. This background makes the similarity-based approaches attractive and promising. This article reviews similarity-based machine learning methods for predicting drug-target interactions, which are state-of-the-art and have aroused great interest in bioinformatics. We briefly describe each method and empirically compare them under a uniform experimental setting to explore their advantages and limitations.
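
To make the idea of combining the two similarities concrete, here is a minimal sketch of one simple similarity-based scheme: each drug-target pair is scored by averaging a similarity-weighted vote over other drugs with a similarity-weighted vote over other targets. It is not a specific method from the review. The function names, the equal weighting of the two sides, and the toy matrices are all assumptions made for illustration.

```python
# Illustrative sketch only: names, weights and data below are hypothetical.

def weighted_profile(sim_row, profiles):
    """Similarity-weighted average of known interaction profiles."""
    total = sum(sim_row)
    n_cols = len(profiles[0])
    if total == 0:
        return [0.0] * n_cols
    return [
        sum(s * p[j] for s, p in zip(sim_row, profiles)) / total
        for j in range(n_cols)
    ]


def predict_scores(drug_sim, target_sim, interactions):
    """Score every drug-target pair from both similarity matrices.

    drug_sim:     n_drugs x n_drugs similarity matrix (chemical space)
    target_sim:   n_targets x n_targets similarity matrix (genomic space)
    interactions: n_drugs x n_targets 0/1 matrix of known interactions
    """
    n_drugs, n_targets = len(interactions), len(interactions[0])

    # Drug side: each drug borrows the interaction profiles of similar drugs.
    drug_side = []
    for i in range(n_drugs):
        row = list(drug_sim[i])
        row[i] = 0.0  # a drug does not vote for itself
        drug_side.append(weighted_profile(row, interactions))

    # Target side: same idea on the transposed interaction matrix.
    by_target = [list(col) for col in zip(*interactions)]
    target_side = []
    for j in range(n_targets):
        row = list(target_sim[j])
        row[j] = 0.0  # a target does not vote for itself
        target_side.append(weighted_profile(row, by_target))

    # Assumption: combine the two views with equal weight.
    return [
        [0.5 * (drug_side[i][j] + target_side[j][i]) for j in range(n_targets)]
        for i in range(n_drugs)
    ]


# Toy data: drug 1 has no known interactions but resembles drug 0.
drug_sim = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
target_sim = [[1.0, 0.5], [0.5, 1.0]]
interactions = [[1, 0], [0, 0], [0, 1]]

scores = predict_scores(drug_sim, target_sim, interactions)
```

In this toy example, drug 1 receives a higher score for target 0 than for target 1, because drug 0, its most similar neighbour, is known to interact with target 0.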

[33]  Chuang Liu,et al.  Prediction of Drug-Target Interactions and Drug Repositioning via Network-Based Inference , 2012, PLoS Comput. Biol..

[34]  Mehmet Gönen,et al.  Predicting drug-target interactions from chemical and genomic kernels using Bayesian matrix factorization , 2012, Bioinform..

[35]  Charles J. Manly,et al.  The impact of informatics and computational chemistry on synthesis and screening. , 2001, Drug discovery today.

[36]  P. Bork,et al.  Large‐scale prediction of drug–target relationships , 2008, FEBS letters.

[37]  Kara Dolinski,et al.  The BioGRID Interaction Database: 2011 update , 2010, Nucleic Acids Res..

[38]  Pierre Acklin,et al.  Similarity Metrics for Ligands Reflecting the Similarity of the Target Proteins , 2003, J. Chem. Inf. Comput. Sci..

[39]  P. Bork,et al.  Drug Target Identification Using Side-Effect Similarity , 2008, Science.

[40]  S. Haggarty,et al.  Multidimensional chemical genetic analysis of diversity-oriented synthesis-derived deacetylase inhibitors using cell-based assays. , 2003, Chemistry & biology.

[41]  Laetitia Martin-Chanas,et al.  Identify drug repurposing candidates by mining the Protein Data Bank , 2011, Briefings Bioinform..

[42]  M. Kanehisa,et al.  Development of a chemical structure comparison method for integrated analysis of chemical and genomic information in the metabolic pathways. , 2003, Journal of the American Chemical Society.

[43]  H. Lehrach,et al.  A Human Protein-Protein Interaction Network: A Resource for Annotating the Proteome , 2005, Cell.

[44]  C. Dobson Chemical space and biology , 2004, Nature.

[45]  Michael J. Keiser,et al.  Predicting new molecular targets for known drugs , 2009, Nature.

[46]  K. Tsuda,et al.  Mining Significant Substructure Pairs for Interpreting Polypharmacology in Drug-Target Network , 2011, PloS one.

[47]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[48]  S Garattini,et al.  Are me-too drugs justified? , 1997, Journal of nephrology.

[49]  Yoshihiro Yamanishi,et al.  Prediction of drug–target interaction networks from the integration of chemical and genomic spaces , 2008, ISMB.

[50]  Yasubumi Sakakibara,et al.  Statistical prediction of protein-chemical interactions based on chemical structure and mass spectrometry data , 2007, Bioinform..

[51]  Xin Chen,et al.  DCDB: Drug combination database , 2010, Bioinform..

[52]  M. Ashburner,et al.  Gene Ontology: tool for the unification of biology , 2000, Nature Genetics.

[53]  Mark Goadrich,et al.  The relationship between Precision-Recall and ROC curves , 2006, ICML.

[54]  T. Klabunde Chemogenomic approaches to drug discovery: similar receptors bind similar ligands , 2007, British journal of pharmacology.

[55]  Michael J. Keiser,et al.  Large Scale Prediction and Testing of Drug Activity on Side-Effect Targets , 2012, Nature.

[56]  J. Irwin,et al.  Lead discovery using molecular docking. , 2002, Current opinion in chemical biology.

[57]  Elena Marchiori,et al.  Gaussian interaction profile kernels for predicting drug-target interaction , 2011, Bioinform..

[58]  David S. Wishart,et al.  DrugBank: a knowledgebase for drugs, drug actions and drug targets , 2007, Nucleic Acids Res..

[59]  Stuart L. Schreiber,et al.  Dissecting glucose signalling with diversity-oriented synthesis and small-molecule microarrays , 2002, Nature.

[60]  Antje Chang,et al.  New Developments , 2003 .

[61]  H. Yabuuchi,et al.  Analysis of multiple compound–protein interactions reveals novel bioactive molecules , 2011, Molecular systems biology.

[62]  Kara Dolinski,et al.  The BioGRID Interaction Database: 2008 update , 2008, Nucleic Acids Res..

[63]  Jean-Philippe Vert,et al.  SIRENE: supervised inference of regulatory networks , 2008, ECCB.

[64]  P. Bork,et al.  Drug discovery in the age of systems biology: the rise of computational approaches for data integration. , 2012, Current opinion in biotechnology.

[65]  Gerhard Hessler,et al.  Drug Design Strategies for Targeting G‐Protein‐Coupled Receptors , 2002, Chembiochem : a European journal of chemical biology.

[66]  Hiroshi Mamitsuka,et al.  A probabilistic model for mining implicit 'chemical compound-gene' relations from literature , 2005, ECCB/JBI.

[67]  Andrew L. Hopkins,et al.  Drug discovery: Predicting promiscuity , 2009, Nature.

[68]  Ioannis Xenarios,et al.  DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions , 2002, Nucleic Acids Res..