Genetic algorithm feature-based resampling for protein structure prediction

Proteins carry out most functions at the cellular level. Computational protein structure prediction (PSP) methods were introduced because experimental methods, such as nuclear magnetic resonance (NMR) and X-ray crystallography (XC), can take many months or even years to produce a structure for a target protein. Much of the work in this area focuses on the choice of search strategy. Two popular approaches in the literature are Monte Carlo based algorithms and genetic algorithms (GAs). GAs have proven useful in the PSP field because they provide a generic search approach, which removes the need to redefine the search strategy for each sequence. They also lend themselves well to feature-based resampling techniques. Feature-based resampling takes previously computed local minima and combines features from them to create new structures that are more uniformly low in free energy. In this work we present a feature-based resampling genetic algorithm that refines the structures output by PSP software. Our results indicate that the approach performs well, producing an average 9.5% improvement in root mean square deviation (RMSD) and a 17.36% improvement in template modeling score (TM-Score).
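To make the resampling idea concrete, the following is a minimal sketch rather than the method used in this work. It assumes each structure is a list of (phi, psi) torsion-angle pairs, treats contiguous segments of those angles as the "features" to recombine, and uses a toy energy (squared deviation from a hypothetical target) in place of a real free-energy function. The names `toy_energy`, `crossover`, `mutate`, and `refine`, and all parameter values, are illustrative assumptions.

```python
import random

# Hypothetical target torsion angles; a stand-in for a low-energy conformation.
TARGET = [(-60.0, -45.0)] * 8


def toy_energy(structure):
    """Toy energy (assumption): squared angular deviation from TARGET."""
    return sum((phi - t_phi) ** 2 + (psi - t_psi) ** 2
               for (phi, psi), (t_phi, t_psi) in zip(structure, TARGET))


def crossover(parent_a, parent_b, rng):
    """Combine features: leading segment of one parent, remainder of the other."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]


def mutate(structure, rng, rate=0.1, step=5.0):
    """Perturb a fraction of residues by a small random angle change."""
    return [(phi + rng.uniform(-step, step), psi + rng.uniform(-step, step))
            if rng.random() < rate else (phi, psi)
            for phi, psi in structure]


def refine(decoys, energy, generations=50, seed=0):
    """Evolve a population seeded with previously computed local minima."""
    rng = random.Random(seed)
    population = sorted(decoys, key=energy)
    size = len(population)
    for _ in range(generations):
        # Select the lower-energy half of the population as parents.
        parents = population[:max(2, size // 2)]
        children = []
        for _ in range(size - 1):
            parent_a, parent_b = rng.sample(parents, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        # Elitism: the current best structure always survives.
        population = sorted([population[0]] + children, key=energy)
    return population[0]


# Usage: build noisy decoys around TARGET, then refine them.
_rng = random.Random(1)
decoys = [[(phi + _rng.uniform(-30, 30), psi + _rng.uniform(-30, 30))
           for phi, psi in TARGET] for _ in range(6)]
best = refine(decoys, toy_energy)
print(min(toy_energy(d) for d in decoys), toy_energy(best))
```

Because the best structure is carried over unchanged each generation, the refined result is never worse than the best input decoy under the chosen energy. A real pipeline would swap in a physical or knowledge-based energy function and richer features than raw torsion segments.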
