Application of amino acid distribution along the sequence for discriminating mesophilic and thermophilic proteins

In this work, we systematically analyzed the distribution of neighboring amino acid pairs (dipeptides) in the sequences of thermophilic and mesophilic proteins. We observed that the occurrences of EE, KK, RR, PP, KI, VV, VE, KE and VK were significantly higher in thermophilic proteins, while the occurrences of QQ, AA, EQ, LL, QA, QL, NN, KQ, QG, RQ, QT and AQ were significantly lower. We examined the mechanism of thermostability and suggest that dipeptide composition carries more information than amino acid composition. Based on dipeptide composition, we developed a statistical method for discriminating thermophilic from mesophilic proteins. The accuracy of our method on the training dataset was 86.3%, and its accuracy on two independent test datasets was 85.5% and 89.7%, respectively. The influence of some specific dipeptides on prediction accuracy is also discussed.
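As an illustration of the feature described above, the following minimal sketch computes the dipeptide composition of a single sequence. It is not the authors' implementation: the function name `dipeptide_composition`, the example sequence, the choice to skip pairs containing non-standard residue codes, and the normalization by the number of valid pairs are all assumptions made for this example. The statistical discrimination step itself is not specified in the abstract and is not shown.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_composition(sequence):
    """Return the relative frequency of each of the 400 neighboring residue pairs.

    Hypothetical helper: overlapping pairs are counted, and pairs containing
    non-standard codes (e.g. X, B, Z) are skipped (an assumption).
    """
    seq = sequence.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    counts = Counter(valid)
    total = len(valid)
    # Normalize by the number of valid pairs so the 400 values sum to 1.
    return {
        a + b: (counts[a + b] / total if total else 0.0)
        for a, b in product(AMINO_ACIDS, repeat=2)
    }


# Made-up sequence, used only to exercise the function.
comp = dipeptide_composition("MKKEEVVKE")
print(comp["KK"], comp["EE"], comp["KE"])
```

The resulting 400-dimensional vector could then serve as input to whichever classifier or statistical test is chosen; comparing the mean frequency of a dipeptide such as EE or QQ between the two protein classes would correspond to the kind of over- and under-representation reported above.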
