Template Scoring Methods for Protein Torsion Angle Prediction

Prediction of backbone torsion angles provides important constraints on the 3D structure of a protein and is receiving growing interest in the structure prediction community. In this paper, we introduce a three-stage machine learning classifier to predict the 7-state torsion angles of a protein. The first two stages employ dynamic Bayesian networks and neural networks to produce an ab-initio prediction of torsion angle states starting from sequence profiles. The third stage is a committee classifier, which combines the ab-initio prediction with a structural frequency profile derived from templates obtained by HHsearch. We develop several structural profile models and obtain significant improvements over the Laplacian scoring technique through: (1) scaling templates by integer powers of the sequence identity score, (2) incorporating other alignment scores as multiplicative factors, and (3) adjusting or optimizing the parameters of the profile models with respect to the similarity interval of the target. We also demonstrate that torsion angle prediction accuracy improves at all levels of target-template similarity, even when the templates are distant from the target. The improvement becomes significantly larger as the template structures get closer to the target.
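The template-scoring idea in (1) and the committee stage can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the uniform pseudocount, the default exponent, and the fixed mixing weight are all assumptions made here for illustration, and the convex combination merely stands in for the committee classifier. Each template votes for its torsion state at every aligned position with a weight equal to its sequence identity raised to an integer power, and the resulting frequency profile is then blended with an ab-initio distribution.

```python
import numpy as np

N_STATES = 7  # number of torsion angle states named in the abstract


def structural_profile(template_states, identities, power=2, pseudocount=1.0):
    """Per-residue frequency profile over torsion states (illustrative only).

    template_states: int array of shape (n_templates, length); -1 marks a gap.
    identities: sequence identity of each template to the target, in [0, 1].
    power: integer exponent applied to the identity score (assumed default).
    pseudocount: uniform smoothing mass per position (an assumption, not the
        paper's scoring scheme).
    """
    template_states = np.asarray(template_states)
    weights = np.asarray(identities, dtype=float) ** power
    n_templates, length = template_states.shape

    # Start every position with a uniform pseudocount spread over all states.
    counts = np.full((length, N_STATES), pseudocount / N_STATES)
    for t in range(n_templates):
        for i in range(length):
            state = template_states[t, i]
            if state >= 0:  # skip gapped positions
                counts[i, state] += weights[t]

    # Normalize each position to a probability distribution.
    return counts / counts.sum(axis=1, keepdims=True)


def combine(ab_initio, profile, mix=0.5):
    """Convex combination of two distributions (hypothetical stand-in
    for the committee classifier)."""
    combined = mix * np.asarray(ab_initio, dtype=float) + (1.0 - mix) * profile
    return combined / combined.sum(axis=1, keepdims=True)


# Two templates aligned to a target of length 2; the second has a gap.
profile = structural_profile([[0, 1], [2, -1]], identities=[0.9, 0.3], power=2)
uniform_ab_initio = np.full((2, N_STATES), 1.0 / N_STATES)
combined = combine(uniform_ab_initio, profile)
```

Raising the identity to a higher power makes close templates dominate the profile more strongly, which is one way to read the dependence on the similarity interval mentioned in (3).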
