An On/Off Lattice Approach to Protein Structure Prediction from Contact Maps

An important unsolved problem in structural bioinformatics is protein structure prediction (PSP): the reconstruction of a biologically plausible three-dimensional structure for a protein from its amino acid sequence alone. The PSP problem is of enormous interest because the function of a protein is a direct consequence of its three-dimensional structure. Approaches to PSP use protein models that range from very realistic (all-atom) to very simple (on a lattice). Finer representations usually generate better candidate structures, but are computationally more costly than the simpler on-lattice ones. In this work we propose a combined approach that uses a simple and fast lattice protein structure prediction algorithm, REMC-HPPFP, to compute a number of coarse candidate structures. These are then refined by 3Distill, an off-lattice, residue-level protein structure predictor. We show that the lattice algorithm is able to bootstrap 3Distill, which consequently converges much faster, allowing for shorter execution times without noticeably degrading the quality of the predictions. This novel method allows us to generate a large set of decoys of quality comparable to those computed by the off-lattice method alone, using only a fraction of the computation. As a result, our method could be used to build large databases of predicted decoys for analysis, or to select the best candidate structures through reranking techniques. Furthermore, our method is generic, in that it can be applied to algorithms other than 3Distill.
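To make the "very simple (on a lattice)" end of the spectrum concrete, the sketch below scores a conformation under the standard hydrophobic-polar (HP) model on a 3D cubic lattice. It assumes the usual HP convention of -1 per pair of non-consecutive H residues on adjacent lattice sites. The function name `hp_energy` and the input format are illustrative choices, not taken from REMC-HPPFP or 3Distill.

```python
from itertools import combinations


def hp_energy(sequence, coords):
    """Score a lattice conformation under the standard HP model (assumed here).

    sequence: string over {'H', 'P'}, one letter per residue.
    coords:   list of integer (x, y, z) tuples, one lattice site per residue.
    Returns minus the number of H-H contacts between non-consecutive residues.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")

    def lattice_distance(a, b):
        # Manhattan distance; 1 means the two sites are lattice neighbours.
        return sum(abs(p - q) for p, q in zip(a, b))

    # Consecutive residues must occupy neighbouring sites (chain connectivity).
    for k in range(len(coords) - 1):
        if lattice_distance(coords[k], coords[k + 1]) != 1:
            raise ValueError("chain is broken between residues %d and %d" % (k, k + 1))

    energy = 0
    for i, j in combinations(range(len(sequence)), 2):
        if j - i < 2:
            continue  # bonded neighbours do not count as contacts
        if sequence[i] == "H" and sequence[j] == "H":
            if lattice_distance(coords[i], coords[j]) == 1:
                energy -= 1
    return energy


# A four-residue chain folded into a square: residues 0 and 3 are in contact.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
folded_energy = hp_energy("HPPH", square)
```

Because the score depends only on integer coordinates and a two-letter alphabet, it is cheap to evaluate many times, which is what makes lattice search attractive as a source of coarse starting structures for a more detailed off-lattice refiner.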
