Using self-consistent fields to bias Monte Carlo methods with applications to designing and sampling protein sequences

For complex multidimensional systems, Monte Carlo methods are useful for sampling probable regions of a configuration space and, in the context of annealing, for determining “low energy” or “high scoring” configurations. Such methods have been used in protein design as a means of identifying amino acid sequences that are energetically compatible with a particular backbone structure. As with many other applications of Monte Carlo methods, such searches can be inefficient if trial configurations (protein sequences) in the Markov chain are chosen randomly. Here a mean-field biased Monte Carlo (MFBMC) method is presented and applied to designing and sampling protein sequences. The MFBMC method uses predetermined sequence identity probabilities w_i(α), the probability that amino acid α occupies residue position i, to bias the sequence selection. The w_i(α) are calculated using a self-consistent mean-field theory that can estimate the number and composition of sequences having predetermined values of energetically related foldability criteria. The MFBMC method is applied to both ...
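The idea of biasing trial moves with site probabilities w_i(α) can be illustrated with a minimal sketch. The abstract does not give the acceptance rule, so the code below assumes a standard Metropolis-Hastings correction for a non-uniform proposal. The function name `biased_mc_step`, the table `w`, and the toy energy function are hypothetical and chosen only for illustration. In the actual method the w_i(α) would come from the self-consistent mean-field theory, whereas here they are supplied by hand and assumed to be strictly positive.

```python
import math
import random


def biased_mc_step(seq, w, energy, beta, rng):
    """One Metropolis-Hastings step with site-wise biased proposals.

    seq    : list of residue labels (the current sequence)
    w      : list of dicts; w[i][a] is the proposal probability of
             residue a at site i (assumed > 0; hand-made here, not
             computed from any mean-field theory)
    energy : callable mapping a sequence to a number (hypothetical)
    beta   : inverse temperature
    rng    : random.Random instance
    """
    # Pick a site uniformly, then draw a trial residue from w[i].
    i = rng.randrange(len(seq))
    residues = list(w[i].keys())
    weights = [w[i][a] for a in residues]
    new = rng.choices(residues, weights=weights, k=1)[0]
    old = seq[i]
    if new == old:
        return seq  # proposal leaves the sequence unchanged

    trial = seq[:i] + [new] + seq[i + 1:]

    # Hastings ratio: Boltzmann factor times q(new -> old) / q(old -> new).
    # The uniform site-selection factor cancels, leaving w_i(old) / w_i(new).
    log_acc = (-beta * (energy(trial) - energy(seq))
               + math.log(w[i][old]) - math.log(w[i][new]))
    if log_acc >= 0 or rng.random() < math.exp(log_acc):
        return trial
    return seq


def run_chain(seq, w, energy, beta, n_steps, seed=0):
    """Apply n_steps biased moves and return the final sequence."""
    rng = random.Random(seed)
    for _ in range(n_steps):
        seq = biased_mc_step(seq, w, energy, beta, rng)
    return seq
```

Under these assumptions, the correction factor w_i(old)/w_i(new) keeps the chain sampling the Boltzmann distribution of the supplied energy function even though trial residues are drawn non-uniformly. If the goal is only annealing toward low-energy sequences, a different acceptance rule could be used instead.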
