Using compound similarity and functional domain composition for prediction of drug-target interaction networks.

Studying the interactions between drugs and target proteins is an essential step in genomic drug discovery. Compound-protein (drug-target) interactions are very hard to determine by experiment alone, so effective prediction models built with machine learning or data mining methods can serve as a useful complement. In this study, a prediction method is proposed that is based on the Nearest Neighbor Algorithm and a novel metric obtained by combining compound similarity and functional domain composition. The target proteins were divided into four groups: enzymes, ion channels, G protein-coupled receptors, and nuclear receptors. One predictor with optimal parameters was established for each group, giving four predictors in total. The overall prediction accuracies for the four groups, evaluated by the jackknife cross-validation test, are 90.23%, 94.74%, 97.80%, and 97.51%, respectively, indicating that compound similarity and functional domain composition are very effective for predicting drug-target interaction networks.
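The general idea of a nearest neighbor predictor evaluated by the jackknife (leave-one-out) test can be illustrated with a minimal sketch. The abstract does not specify how the two information sources are combined, so everything below is an assumption made for illustration: the weighted sum in `pair_similarity`, the `weight` parameter, the cosine similarity over binary domain vectors, the precomputed `compound_sim` table, and the toy data are all hypothetical and are not the authors' actual metric or dataset.

```python
import math

def cosine(u, v):
    """Cosine similarity between two functional-domain vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def pair_similarity(p, q, compound_sim, weight=0.5):
    """Hypothetical combined metric for two drug-target pairs.

    p, q: (drug_id, domain_vector). The weighted sum is an assumed
    combination rule, not the one used in the paper.
    """
    chem = 1.0 if p[0] == q[0] else compound_sim[frozenset((p[0], q[0]))]
    return weight * chem + (1 - weight) * cosine(p[1], q[1])

def predict(query, training, compound_sim, weight=0.5):
    """Return the label of the training sample most similar to the query."""
    nearest = max(
        training,
        key=lambda s: pair_similarity(query, s[0], compound_sim, weight),
    )
    return nearest[1]

def jackknife_accuracy(samples, compound_sim, weight=0.5):
    """Leave each sample out in turn and predict it from all the others."""
    hits = 0
    for i, (pair, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        hits += predict(pair, rest, compound_sim, weight) == label
    return hits / len(samples)

# Toy data (invented): label 1 = interacting pair, 0 = non-interacting pair.
samples = [
    (("d1", [1, 0, 1, 0]), 1),
    (("d2", [1, 0, 1, 0]), 1),
    (("d3", [0, 1, 0, 1]), 0),
    (("d4", [0, 1, 0, 1]), 0),
]
compound_sim = {
    frozenset(("d1", "d2")): 0.9,
    frozenset(("d1", "d3")): 0.1,
    frozenset(("d1", "d4")): 0.2,
    frozenset(("d2", "d3")): 0.2,
    frozenset(("d2", "d4")): 0.1,
    frozenset(("d3", "d4")): 0.8,
}

accuracy = jackknife_accuracy(samples, compound_sim)
print(f"jackknife accuracy: {accuracy:.2%}")
```

In this sketch a separate call to `jackknife_accuracy` per protein group, each with its own tuned `weight`, would correspond to the four group-specific predictors described above.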
