Prediction of protein tertiary structural classes based on ensemble learning

In the post-genomic era, protein structure prediction has become a central topic in bioinformatics, and the prediction of protein tertiary structural classes has emerged as an active research area within it. In this paper, a novel feature extraction method based on the predicted secondary structure sequence and the corresponding E-H sequence is employed. Two hierarchical classification models are designed, and the one that achieves the better prediction accuracy is chosen. On the basis of this hierarchical classification model, ensemble learning is employed to predict protein tertiary structural classes. To test the proposed method, three low-homology datasets (the 640, 25PDB and 1189 datasets) are used as benchmarks for protein tertiary structural class prediction. The method is evaluated with 10-fold cross-validation and compared with existing methods. Its overall accuracies are 5.57%, 4.53% and 2.16% higher than those of the existing methods on the three datasets, respectively.
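The abstract does not define the E-H sequence, so the following is only a minimal sketch under an assumption: that the predicted secondary structure is a string over the states H (helix), E (strand) and C (coil), and that the E-H sequence is obtained by dropping the coil states. The function names `to_eh_sequence` and `state_fractions` are hypothetical and are not taken from the paper; the composition feature is shown only as one simple example of a feature that could be computed from such sequences.

```python
from collections import Counter


def to_eh_sequence(ss: str) -> str:
    """Keep only helix (H) and strand (E) states, dropping coil (C).

    Assumption: this is how the E-H sequence is derived; the abstract
    does not give the definition.
    """
    return "".join(state for state in ss if state in "HE")


def state_fractions(ss: str) -> dict:
    """Return the fraction of H, E and C states in a secondary structure string."""
    counts = Counter(ss)
    total = len(ss)
    return {s: (counts[s] / total if total else 0.0) for s in "HEC"}


if __name__ == "__main__":
    predicted = "CCHHHHCCEEECC"  # illustrative string, not real predictor output
    print(to_eh_sequence(predicted))
    print(state_fractions(predicted))
```

In practice the input string would come from a secondary structure predictor, and features computed from both sequences would be passed to the classifiers; neither step is modelled here.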
