Guided macro-mutation in a graded energy based genetic algorithm for protein structure prediction

Protein structure prediction is considered one of the most challenging and computationally intractable combinatorial problems. Efficient modeling of the convoluted search space, clever use of energy functions, and, more importantly, effective sampling algorithms are therefore crucial to addressing it. For protein structure modeling, an off-lattice model offers limited scope to exercise and evaluate algorithmic developments because of its astronomically large set of data points. In contrast, an on-lattice model widens that scope and permits the study of relatively larger proteins because its set of data points is finite. In this work, we took full advantage of an on-lattice model by using a face-centered cubic (FCC) lattice, which has the highest packing density and the maximum degree of freedom. We proposed a genetic algorithm (GA) for conformational search based on a graded energy that strategically mixes the Miyazawa-Jernigan (MJ) energy with the hydrophobic-polar (HP) energy. Within the GA, we introduced a macro-mutation operator guided by the 2 × 2 HP energy to explore the best possible local changes exhaustively. The 20 × 20 MJ energy model, the ultimate objective function that our GA minimizes, in turn accounts for the interactions among the 20 different amino acids and allows the search for globally acceptable conformations. On a set of benchmark proteins, our proposed approach outperformed state-of-the-art approaches in terms of free energy levels and root-mean-square deviations.
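To make the two energy levels concrete, the following minimal Python sketch scores a conformation on an FCC lattice with a pluggable pairwise contact energy. It is an illustration under stated assumptions, not the implementation used in this work: the names `contact_energy`, `hp_energy`, and `table_energy` are hypothetical, the HP rule uses the common convention of -1 per hydrophobic-hydrophobic contact, and the numbers in the toy table are placeholders rather than real MJ values.

```python
from itertools import combinations

# The 12 FCC neighbor offsets: exactly two coordinates are +/-1, the third is 0.
FCC_NEIGHBORS = [
    (dx, dy, dz)
    for dx in (-1, 0, 1)
    for dy in (-1, 0, 1)
    for dz in (-1, 0, 1)
    if abs(dx) + abs(dy) + abs(dz) == 2
]


def in_contact(a, b):
    """True when two integer lattice points are FCC neighbors (squared distance 2)."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) == 2


def contact_energy(sequence, coords, pair_energy):
    """Sum pair_energy over non-consecutive residues that are lattice neighbors."""
    total = 0.0
    for i, j in combinations(range(len(sequence)), 2):
        if j - i > 1 and in_contact(coords[i], coords[j]):
            total += pair_energy(sequence[i], sequence[j])
    return total


def hp_energy(a, b):
    """2 x 2 HP rule (assumed convention): -1 for an H-H contact, 0 otherwise."""
    return -1.0 if a == "H" and b == "H" else 0.0


def table_energy(table):
    """Build a pair-energy function from a symmetric lookup table.

    Keys are alphabetically sorted residue pairs; missing pairs score 0.
    A real 20 x 20 MJ matrix would be supplied here by the caller.
    """
    def pair_energy(a, b):
        return table.get(tuple(sorted((a, b))), 0.0)
    return pair_energy


# A four-residue self-avoiding walk on the FCC lattice.
coords = [(0, 0, 0), (1, 1, 0), (2, 0, 0), (1, 0, 1)]

# Coarse level: HP energy on an H/P string.
hp_score = contact_energy("HPHH", coords, hp_energy)

# Fine level: placeholder table standing in for MJ values (NOT real MJ numbers).
toy_table = {("L", "L"): -2.0, ("K", "L"): -0.5}
fine_score = contact_energy("LKVL", coords, table_energy(toy_table))
```

Passing the pair energy as a function keeps the contact-counting loop identical for both levels, so under these assumptions a mutation operator could rank local moves with `hp_energy` while the GA's fitness uses the finer table.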
