GPCR-2L: predicting G protein-coupled receptors and their types by hybridizing two different modes of pseudo amino acid compositions.

G protein-coupled receptors (GPCRs) are among the most frequent targets of therapeutic drugs. With the avalanche of newly generated protein sequences in the post-genomic age, an automated method for rapidly identifying GPCRs and their types is highly desirable for expediting drug discovery. A new predictor was developed by hybridizing two different modes of pseudo amino acid composition (PseAAC): the functional domain PseAAC and the low-frequency Fourier spectrum PseAAC. The predictor is called GPCR-2L, where "2L" indicates that it works in two layers: the first layer identifies a query protein as a GPCR or a non-GPCR; if it is a GPCR, the second layer automatically goes on to assign it to one of the following six types: (1) rhodopsin-like (Class A), (2) secretin-like (Class B), (3) metabotropic glutamate/pheromone (Class C), (4) fungal pheromone (Class D), (5) cAMP receptor (Class E), or (6) frizzled/smoothened family (Class F). The overall success rate of GPCR-2L in identifying proteins as GPCRs or non-GPCRs is over 97.2%, and its success rate in identifying GPCRs among their six types is over 97.8%. These success rates were obtained by rigorous jackknife cross-validation on a stringent benchmark dataset in which no protein had ≥40% pairwise sequence identity to any other protein in the same subset. GPCR-2L is freely accessible as a user-friendly web server at http://icpr.jci.edu.cn/, which returns the two-level results in about 20 s for a query protein sequence of 500 amino acids; longer sequences generally take more time. The high success rates reported here indicate that combining functional domain information with low-frequency Fourier spectrum analysis is an effective approach to identifying GPCRs and their types. It is anticipated that GPCR-2L may become a useful tool for both basic research and drug development in areas related to GPCRs.
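
The two-layer control flow and the jackknife test described above can be illustrated with a minimal Python sketch. It is not the GPCR-2L implementation: the abstract does not specify the classifier, so a plain 1-nearest-neighbor rule with Euclidean distance is assumed here as a stand-in, the feature vectors are assumed to be precomputed, and the function names `nearest_label`, `predict_two_layer`, and `jackknife_accuracy` are hypothetical.

```python
import math


def nearest_label(query, samples):
    """Return the label of the sample whose feature vector is closest to query.

    samples is a list of (feature_vector, label) pairs. Plain 1-NN with
    Euclidean distance is an assumed stand-in, not the paper's classifier.
    """
    return min(samples, key=lambda s: math.dist(query, s[0]))[1]


def predict_two_layer(query, layer1_samples, layer2_samples):
    """Layer 1 decides GPCR vs. non-GPCR; layer 2 runs only for GPCRs.

    layer1_samples carry the labels "GPCR" / "non-GPCR";
    layer2_samples carry GPCR type labels such as "A" ... "F".
    """
    if nearest_label(query, layer1_samples) != "GPCR":
        return ("non-GPCR", None)
    return ("GPCR", nearest_label(query, layer2_samples))


def jackknife_accuracy(samples):
    """Leave-one-out test: each sample is predicted from all the others."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]  # training set without sample i
        hits += nearest_label(features, rest) == label
    return hits / len(samples)
```

In this sketch the second layer is never consulted for a protein rejected by the first, which mirrors the "if it is a GPCR, continue" behavior described in the abstract; the jackknife routine removes each protein in turn from the training set before predicting it.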
