Large-scale prediction of drug-target interactions using protein sequences and drug topological structures.

The identification of interactions between drugs and target proteins plays a key role in genomic drug discovery. Determining drug-target interactions by experiment alone is both time-consuming and costly. There is therefore an urgent need for new in silico prediction approaches capable of identifying potential drug-target interactions in a timely manner. In this article, we aim to extend current structure-activity relationship (SAR) methodology to meet this need. A drug-target interaction can be regarded as an event or property triggered by many influencing factors from both the drug and the target protein. Each interaction pair can thus be represented by factors derived from the structural and physicochemical properties of the drug and the protein together. To realize this, drug molecules are encoded with MACCS substructure fingerprints representing the presence of certain functional groups or fragments, and proteins are encoded with biochemical and physicochemical properties. Four classes of drug-target interaction networks in humans, involving enzymes, ion channels, G-protein-coupled receptors (GPCRs) and nuclear receptors, are used independently to establish predictive models with support vector machines (SVMs). The SVM models gave prediction accuracies of 90.31%, 88.91%, 84.68% and 83.74% for the four datasets, respectively. In conclusion, the results demonstrate the ability of the proposed method to predict drug-target interactions and show a general compatibility between the new scheme and current SAR methodology. They open the way to new investigations on the diversity analysis and prediction of drug-target interactions.
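The encoding of an interaction pair described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the drug fingerprint is assumed to be supplied as a precomputed set of MACCS key indices (1 to 166), and amino acid composition is used only as a simple stand-in for the biochemical and physicochemical protein descriptors, which the abstract does not specify.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
N_MACCS_BITS = 166                    # size of the public MACCS key set


def drug_vector(on_bits):
    """Binary fingerprint from a set of 1-based MACCS key indices.

    Assumption: the keys were computed elsewhere by a cheminformatics toolkit.
    """
    vec = [0] * N_MACCS_BITS
    for bit in on_bits:
        if not 1 <= bit <= N_MACCS_BITS:
            raise ValueError(f"MACCS key index out of range: {bit}")
        vec[bit - 1] = 1
    return vec


def protein_vector(sequence):
    """Amino acid composition: the fraction of each standard residue.

    Assumption: a stand-in for the (unspecified) protein descriptors.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def pair_vector(on_bits, sequence):
    """Concatenate drug and protein features into one vector per pair."""
    return drug_vector(on_bits) + protein_vector(sequence)


if __name__ == "__main__":
    # Hypothetical drug with three keys set, paired with a toy sequence.
    features = pair_vector({1, 42, 166}, "MKTAYIAK")
    print(len(features))
```

Vectors built this way for known interacting and non-interacting pairs could then be passed to any SVM implementation (for example, scikit-learn's `SVC`) to train a binary classifier for each target class.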
