Efficient sampling of protein conformational space using fast loop building and batch minimization on highly parallel computers

All‐atom sampling is a critical and compute‐intensive end stage of protein structural modeling. Because of the vast size and extreme ruggedness of conformational space, even close to the native structure, the high‐resolution sampling problem is almost as difficult as predicting the rough fold of a protein. Here, we present a combination of new algorithms that considerably speeds up the exploration of very rugged conformational landscapes and is capable of finding previously hidden low‐energy states. The approach is based on a hierarchical workflow and can be parallelized on supercomputers with up to 128,000 compute cores with near‐perfect efficiency. Such scaling behavior is notable: with Moore's law continuing only in the number of cores per chip, parallelizability is a critical property of new algorithms. Using the enhanced sampling power, we have uncovered previously invisible deficiencies in the Rosetta force field and created an extensive decoy training set for optimizing and testing force fields. © 2012 Wiley Periodicals, Inc.
