Genetic operators based on backbone constraint angles for protein structure prediction

This paper describes the development of genetic operators for protein structure prediction based on the frequency of occurrence of main-chain dihedral angles observed in 5,387 experimentally determined protein structures. Constraining the conformational space in this way accelerates the genetic algorithm's search and improves its predictive capacity. The operators were tested in a crowding-based steady-state genetic algorithm implemented in the GAPF program, using a test set of eight proteins belonging to different classes (i.e., mainly α, α/β, and mainly β). The results show that operators with backbone dihedral angle ranges restricted for each amino acid and secondary structure type reduce by up to 75% the number of energy function evaluations required to obtain results equivalent to those obtained without the new operators. Furthermore, the new operators increased the capacity of the algorithm to obtain better protein models: for the two largest proteins in the test set, 1BDD and 1GYZ, the GDT-TS value of the best predicted models increased by more than 10% compared with the values obtained using the standard version of the algorithm.
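The idea of a mutation operator restricted by observed dihedral angle frequencies can be illustrated with a minimal sketch. It is not the GAPF implementation: the table `ANGLE_BINS`, its bin boundaries and counts, and the functions `sample_angles` and `mutate` are hypothetical names and placeholder values introduced here for illustration only. The sketch assumes that each (amino acid, secondary structure) pair maps to a small set of (φ, ψ) bins weighted by how often they were observed, and that residues without an entry fall back to the unrestricted range.

```python
import random

# Hypothetical frequency table: for each (amino acid, secondary structure),
# a list of (phi_min, phi_max, psi_min, psi_max, count) bins in degrees.
# The values are illustrative placeholders, not data from the paper.
ANGLE_BINS = {
    ("ALA", "H"): [(-80.0, -50.0, -60.0, -30.0, 90),
                   (-100.0, -80.0, -30.0, 0.0, 10)],
    ("VAL", "E"): [(-150.0, -110.0, 110.0, 160.0, 70),
                   (-110.0, -80.0, 100.0, 140.0, 30)],
}

# Fallback when no statistics are available: the whole (phi, psi) plane.
UNRESTRICTED = [(-180.0, 180.0, -180.0, 180.0, 1)]


def sample_angles(residue, ss, rng):
    """Pick a bin with probability proportional to its count,
    then draw (phi, psi) uniformly inside that bin."""
    bins = ANGLE_BINS.get((residue, ss), UNRESTRICTED)
    weights = [b[4] for b in bins]
    phi_min, phi_max, psi_min, psi_max, _ = rng.choices(bins, weights=weights, k=1)[0]
    return rng.uniform(phi_min, phi_max), rng.uniform(psi_min, psi_max)


def mutate(residues, ss_types, angles, rate, rng):
    """Return a new list of (phi, psi) pairs in which each residue is
    resampled from its restricted distribution with probability `rate`."""
    mutated = []
    for residue, ss, old in zip(residues, ss_types, angles):
        if rng.random() < rate:
            mutated.append(sample_angles(residue, ss, rng))
        else:
            mutated.append(old)
    return mutated


rng = random.Random(0)
child = mutate(["ALA", "VAL"], ["H", "E"], [(0.0, 0.0), (0.0, 0.0)], 1.0, rng)
```

Because every mutated residue is drawn from bins that were actually observed for its type and secondary structure, the search spends no evaluations on conformations outside those ranges, which is the mechanism the abstract credits for the reduction in energy function evaluations.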
