A Parallel Framework for Multipoint Spiral Search in ab Initio Protein Structure Prediction

Protein structure prediction is a computationally very challenging problem. Many existing search algorithms attempt to solve it by exploring possible structures and finding the one with the minimum free energy. However, these algorithms perform poorly on large proteins because the search space is astronomically large. In this paper, we present a multipoint spiral search framework that uses parallel processing techniques to expedite exploration by starting from different points. In our approach, a set of random initial solutions is generated and distributed to different threads. Each thread runs for a predefined period of time, and the improved solutions are stored per thread. When the threads finish, the solutions are merged and duplicates are removed. A selected set of distinct solutions is then distributed to the threads again. In our ab initio protein structure prediction method, we use the three-dimensional face-centred-cubic lattice for structure-backbone mapping. We use both the low-resolution hydrophobic-polar energy model and the high-resolution 20 × 20 energy model to guide the search. The experimental results show that our new parallel framework significantly improves on the results obtained by the state-of-the-art single-point search approaches for both energy models on the three-dimensional face-centred-cubic lattice. We also show experimentally the effectiveness of mixing energy models within parallel threads.
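The generate, distribute, merge, deduplicate, and redistribute cycle described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_energy` is a hypothetical stand-in for the lattice energy models, `local_search` is a plain hill climber standing in for spiral search, a fixed step budget replaces the predefined time period so that runs are repeatable, and picking the lowest-energy distinct solutions is only one possible selection rule. All function and parameter names are invented for the example.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def toy_energy(sol):
    # Hypothetical stand-in for the HP or 20 x 20 energy: lower is better.
    return sum(x * x for x in sol)


def local_search(start, steps, seed):
    # Placeholder for one thread's search (spiral search in the paper).
    # Assumption: a step budget replaces the wall-clock time limit.
    rng = random.Random(seed)
    current = start
    improved = [start]  # solutions stored per thread
    for _ in range(steps):
        i = rng.randrange(len(current))
        cand = list(current)
        cand[i] += rng.choice((-1, 1))
        cand = tuple(cand)
        if toy_energy(cand) < toy_energy(current):
            current = cand
            improved.append(cand)
    return improved


def multipoint_search(n_threads=4, rounds=3, steps=200, dim=6, seed=0):
    rng = random.Random(seed)
    # Random initial solutions, one per thread.
    starts = [tuple(rng.randint(-10, 10) for _ in range(dim))
              for _ in range(n_threads)]
    best = min(starts, key=toy_energy)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        for _ in range(rounds):
            seeds = [rng.randrange(2 ** 32) for _ in starts]
            budgets = [steps] * len(starts)
            per_thread = list(pool.map(local_search, starts, budgets, seeds))
            # Merge the per-thread results; the set removes duplicates.
            merged = {sol for sols in per_thread for sol in sols}
            # Assumed selection rule: keep the lowest-energy distinct solutions.
            ranked = sorted(merged, key=lambda s: (toy_energy(s), s))
            starts = ranked[:n_threads]
            best = min(best, starts[0], key=toy_energy)
    return best, toy_energy(best)


if __name__ == "__main__":
    solution, energy = multipoint_search()
    print(solution, energy)
```

In CPython, threads will not speed up this CPU-bound toy; the sketch only shows the control flow of the framework, and a process pool or a compiled implementation would be needed for real parallel gains.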
