A protein structural class prediction method based on novel features.

In this study, a 12-dimensional feature vector is constructed to reflect the general contents and spatial arrangements of the secondary structural elements of a given protein sequence. Of the 12 features, 6 are newly designed to improve prediction accuracy for the α/β and α + β classes, based on the distributions of α-helices and β-strands and on the characteristics of parallel and anti-parallel β-sheets. To evaluate the method, the jackknife cross-validation test is applied to two widely used datasets, 25PDB and 1189, with sequence similarity lower than 25% and 40%, respectively. Our method outperforms recently reported methods in most cases, and the 6 newly designed features have a significant positive effect on prediction accuracy, especially for the α/β and α + β classes.
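
The general idea of content and arrangement features, evaluated by a jackknife (leave-one-out) test, can be illustrated with a minimal sketch. This is not the 12-feature vector of the study, whose exact definitions are not given here. The function names (`segments`, `ss_features`, `jackknife_accuracy`, `nearest_neighbor`), the five illustrative features, the H/E/C string encoding, and the toy data are all assumptions made for this example, and a simple 1-nearest-neighbour rule stands in for whichever classifier the study uses.

```python
from typing import Callable, List, Sequence


def segments(ss: str, state: str) -> List[int]:
    """Lengths of maximal runs of `state` in a secondary-structure string."""
    runs, current = [], 0
    for ch in ss:
        if ch == state:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def ss_features(ss: str) -> List[float]:
    """Illustrative features from an H/E/C string (hypothetical, not the paper's)."""
    n = len(ss)
    if n == 0:
        raise ValueError("empty secondary-structure string")
    h_runs, e_runs = segments(ss, "H"), segments(ss, "E")
    # Drop coil and collapse repeats, e.g. "HHHCEEECHH" -> ['H', 'E', 'H'].
    order: List[str] = []
    for ch in ss:
        if ch in "HE" and (not order or order[-1] != ch):
            order.append(ch)
    n_segments = len(h_runs) + len(e_runs)
    # Fraction of adjacent segment pairs that switch between helix and strand;
    # high values suggest interleaved elements, low values suggest segregated ones.
    alternation = (len(order) - 1) / (n_segments - 1) if n_segments > 1 else 0.0
    return [
        sum(h_runs) / n,            # helix content
        sum(e_runs) / n,            # strand content
        max(h_runs, default=0) / n,  # longest helix, normalised by length
        max(e_runs, default=0) / n,  # longest strand, normalised by length
        alternation,
    ]


def nearest_neighbor(train_X: Sequence[Sequence[float]],
                     train_y: Sequence[str],
                     x: Sequence[float]) -> str:
    """Stand-in classifier: label of the closest training vector."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_X]
    return train_y[dists.index(min(dists))]


def jackknife_accuracy(X: Sequence[Sequence[float]],
                       y: Sequence[str],
                       classify: Callable[..., str]) -> float:
    """Leave-one-out test: each sample is predicted from all the others."""
    X, y = list(X), list(y)
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if classify(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Toy data (made up): two helix-rich and two strand-rich strings.
toy = [("HHHHHHCCHHHH", "all-alpha"), ("HHHHCHHHHHCC", "all-alpha"),
       ("EEEECCEEEE", "all-beta"), ("CEEEECEEEC", "all-beta")]
X = [ss_features(s) for s, _ in toy]
y = [label for _, label in toy]
accuracy = jackknife_accuracy(X, y, nearest_neighbor)
```

In practice the secondary-structure string would come from a predictor rather than being written by hand, and the jackknife loop would wrap the study's actual classifier and feature set.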
