Combining classifiers for protein secondary structure prediction

Protein secondary structure prediction is an important step in estimating the three-dimensional structure of proteins. Among the many methods developed for predicting structural properties of proteins, hybrid classifiers and ensembles that combine predictions from several models have been shown to improve accuracy. In this paper, we train, optimize and combine a support vector machine, a deep convolutional neural field and a random forest in the second stage of a hybrid classifier for protein secondary structure prediction. We demonstrate that the overall accuracy of the proposed ensemble is comparable to that of state-of-the-art methods in the most difficult prediction setting, and that combining the selected models has the potential to further improve on the accuracy of the base learners.
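
As a rough illustration of what combining base learners can look like, the sketch below averages per-residue class probabilities from three models and picks the highest-scoring class. This is only one possible combination rule and is an assumption for illustration: the abstract does not state which rule the ensemble uses. The three-state labels (H, E, C), the function name `combine_by_averaging`, the optional weights and the toy probabilities are all hypothetical.

```python
from typing import List, Optional, Sequence

# Assumed 3-state labels: helix (H), strand (E), coil (C).
CLASSES = ("H", "E", "C")


def combine_by_averaging(
    per_model_probs: Sequence[Sequence[Sequence[float]]],
    weights: Optional[List[float]] = None,
) -> str:
    """Combine per-residue class probabilities from several models.

    per_model_probs[m][i][k] is the probability that model m assigns
    to class CLASSES[k] at residue i. Returns one label per residue.
    """
    n_models = len(per_model_probs)
    if weights is None:
        # Unweighted average: every base learner counts equally.
        weights = [1.0 / n_models] * n_models
    n_residues = len(per_model_probs[0])
    labels = []
    for i in range(n_residues):
        # Weighted sum of each class's probability across models.
        scores = [
            sum(w * probs[i][k] for w, probs in zip(weights, per_model_probs))
            for k in range(len(CLASSES))
        ]
        labels.append(CLASSES[scores.index(max(scores))])
    return "".join(labels)


# Toy outputs for a 3-residue sequence (made-up numbers, not real predictions).
svm_probs = [[0.7, 0.2, 0.1], [0.4, 0.5, 0.1], [0.1, 0.2, 0.7]]
dcnf_probs = [[0.6, 0.1, 0.3], [0.5, 0.3, 0.2], [0.2, 0.2, 0.6]]
rf_probs = [[0.5, 0.3, 0.2], [0.3, 0.3, 0.4], [0.3, 0.3, 0.4]]

print(combine_by_averaging([svm_probs, dcnf_probs, rf_probs]))
```

In this toy case the second residue would be labelled E by the first model alone, but the averaged scores favour H, which shows how an ensemble can override a single base learner.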
