LogitBoost classifier for discriminating thermophilic and mesophilic proteins.

A novel classifier, LogitBoost, was introduced to discriminate thermophilic from mesophilic proteins on the basis of their primary structures. With the 20-amino-acid composition as the feature vector, the overall accuracy was 97.0% in the self-consistency check and 86.6% in five-fold cross-validation. To test whether the method also applies to a wider range of biological targets, an independent testing dataset was used, on which the LogitBoost-based method achieved an overall classification accuracy of 88.9%. Across the three validation approaches, LogitBoost outperformed AdaBoost and performed comparably with an RBF neural network and a support vector machine. The influence of protein size on discrimination was also addressed.
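As an illustration of the feature vector named above, the following is a minimal sketch of how a 20-amino-acid composition could be computed from a primary sequence. The function name `amino_acid_composition`, the alphabetical residue ordering, the choice to skip non-standard symbols, and the example sequence are all assumptions made for this sketch; the abstract does not specify these details.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes). The alphabetical
# ordering is an assumption; any fixed ordering would work.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list of fractional residue frequencies.

    Characters outside the 20 standard residues (e.g. 'X', gaps) are
    ignored, which is an assumption of this sketch.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    # Counter returns 0 for residues that never occur.
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical short sequence, used only to show the call.
features = amino_acid_composition("MKVLAAGIVGLK")
```

Each protein then maps to a fixed-length vector whose entries sum to 1, regardless of sequence length, which is the kind of input a classifier such as LogitBoost would be trained on.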
