Exploring the energy landscapes of protein folding simulations with Bayesian computation.

Nested sampling is a Bayesian sampling technique developed to explore probability distributions that are localized in an exponentially small region of the parameter space. The algorithm provides both posterior samples and an estimate of the evidence (marginal likelihood) of the model. It also provides an efficient way to calculate free energies and the expectation values of thermodynamic observables at any temperature, through simple post-processing of the output. Previous applications of the algorithm have yielded large efficiency gains over other sampling techniques, including parallel tempering. In this article, we describe a parallel implementation of the nested sampling algorithm and its application to the problem of protein folding in a Gō-like force field of empirical potentials that were designed to stabilize secondary structure elements in room-temperature simulations. We demonstrate the method by conducting folding simulations on a number of small proteins that are commonly used for testing protein-folding procedures. A topological analysis of the posterior samples is performed to produce energy landscape charts, which give a high-level description of the potential energy surface for the protein-folding simulations. These charts provide qualitative insights into both the folding process and the nature of the model and force field used.
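The post-processing step mentioned above can be illustrated with a minimal sketch. This is not the parallel implementation described in the article: it assumes a toy one-dimensional harmonic energy, a uniform prior on [-5, 5], replacement of discarded points by plain rejection sampling from the prior, and the common approximation that the enclosed prior volume shrinks by a factor K/(K+1) per iteration for K live points. The function names `nested_sampling` and `thermodynamics` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def nested_sampling(energy, sample_prior, n_live, n_iter, rng):
    """Toy nested sampling; returns the discarded energies in decreasing order.

    Assumption: replacements are drawn by rejection sampling from the prior,
    which is only practical for small toy problems.
    """
    live = [sample_prior(rng) for _ in range(n_live)]
    live_e = [energy(x) for x in live]
    discarded = []
    for _ in range(n_iter):
        worst = int(np.argmax(live_e))      # highest-energy live point
        e_max = live_e[worst]
        discarded.append(e_max)
        while True:                         # draw from the prior until E < e_max
            x = sample_prior(rng)
            e = energy(x)
            if e < e_max:
                break
        live[worst] = x
        live_e[worst] = e
    return np.array(discarded)


def thermodynamics(energies, n_live, beta):
    """Estimate log Z(beta) and <E>(beta) from one nested sampling run.

    Assumption: the prior volume after i iterations is X_i = (K/(K+1))**i,
    and discarded point i carries the weight X_{i-1} - X_i.
    """
    energies = np.asarray(energies, dtype=float)
    i = np.arange(1, len(energies) + 1)
    alpha = n_live / (n_live + 1.0)
    weights = alpha ** (i - 1) - alpha ** i
    e_min = energies.min()
    boltz = np.exp(-beta * (energies - e_min))   # shifted for numerical stability
    z_shifted = np.sum(weights * boltz)
    mean_e = np.sum(weights * boltz * energies) / z_shifted
    log_z = np.log(z_shifted) - beta * e_min
    return log_z, mean_e


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    energies = nested_sampling(
        energy=lambda x: 0.5 * x * x,               # toy harmonic energy
        sample_prior=lambda r: r.uniform(-5.0, 5.0),
        n_live=100,
        n_iter=600,
        rng=rng,
    )
    # The same run is reused for every temperature; no resampling is needed.
    for beta in (0.5, 1.0, 2.0):
        log_z, mean_e = thermodynamics(energies, n_live=100, beta=beta)
        print(f"beta={beta}: log Z = {log_z:.3f}, <E> = {mean_e:.3f}")
```

For this toy energy, equipartition gives an expected energy of 1/(2β) as long as the prior box is wide compared with the thermal width, which offers a simple sanity check on the estimates. The sketch ignores the small contribution of the final live points, so the run must be long enough that the remaining prior volume is negligible at the temperatures of interest.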
