Classifying G-protein-coupled receptors to the finest subtype level.

G-protein-coupled receptors (GPCRs) constitute a remarkable protein family of receptors that are involved in a broad range of biological processes. A large number of clinically used drugs elicit their biological effect via a GPCR, so a reliable computational method for predicting the functional roles of GPCRs would be very useful to the pharmaceutical industry. Researchers are increasingly interested in the functional roles of GPCRs at the finest subtype level. However, as new protein sequences accumulate, none of the existing methods can completely classify these GPCRs to their finest subtype level. In this paper, pioneering work was performed to address this problem using a hierarchical classification method. The first level determines whether a query protein is a GPCR or a non-GPCR. If it is predicted to be a GPCR, it is passed down the hierarchy until it is classified to its finest subtype level. GPCRs are characterized by 170 sequence-derived features encapsulating both the amino acid composition and the physicochemical properties of proteins, and support vector machines are used as the classification engine. To test the performance of the present method, a non-redundant dataset was built that is organized at seven levels and covers more functional classes of GPCRs than existing datasets. The number of protein sequences at each level is 5956, 2978, 8079, 8680, 6477, 1580 and 214, respectively. In 5-fold cross-validation tests, overall accuracies of 99.56%, 93.96%, 82.81%, 85.93%, 94.1%, 95.38% and 92.06% were observed at the seven levels, respectively. When compared with some previous methods, the present method achieved a consistently higher overall accuracy. The results demonstrate the power and effectiveness of the proposed method in classifying GPCRs to the finest subtype level.
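The top-down routing described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not list the 170 features, so the sketch computes only the 20-dimensional amino acid composition, and the classifiers are hypothetical callables standing in for trained support vector machines. The names `amino_acid_composition`, `classify_hierarchically`, the `"root"` key and the `"non-GPCR"` label are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

# Hypothetical stand-in for a trained SVM: feature vector in, class label out.
Classifier = Callable[[List[float]], str]


def amino_acid_composition(seq: str) -> List[float]:
    """Return the frequency of each standard amino acid in seq.

    Non-standard characters are ignored. This covers only the composition
    part of the feature set; the physicochemical features are omitted.
    """
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def classify_hierarchically(
    seq: str,
    classifiers: Dict[str, Classifier],
    root: str = "root",
    non_gpcr: str = "non-GPCR",
    max_depth: int = 7,  # the dataset in the abstract has seven levels
) -> List[str]:
    """Walk down the hierarchy and return the list of predicted labels.

    The classifier stored under `root` makes the GPCR / non-GPCR decision.
    Each predicted label is then used as the key of the next classifier,
    until a label has no child classifier or max_depth is reached.
    """
    features = amino_acid_composition(seq)
    path: List[str] = []
    node = root
    while node in classifiers and len(path) < max_depth:
        node = classifiers[node](features)
        path.append(node)
        if node == non_gpcr:
            break  # non-GPCRs are not classified any further
    return path


# Toy rules in place of trained models, purely to exercise the routing.
toy_classifiers: Dict[str, Classifier] = {
    "root": lambda f: "GPCR" if f[AMINO_ACIDS.index("L")] > 0.1 else "non-GPCR",
    "GPCR": lambda f: "Class A",
}

print(classify_hierarchically("LLLLAV", toy_classifiers))
print(classify_hierarchically("AAAA", toy_classifiers))
```

Keying each classifier by its parent label means only one model is evaluated per level, and a query rejected at the first level never reaches the subtype classifiers.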
