Validating a Coarse-Grained Potential Energy Function through Protein Loop Modelling

Coarse-grained (CG) methods for sampling protein conformational space can increase computational efficiency by reducing the number of degrees of freedom. This gain often comes at the expense of non-protein-like local conformational features, which can cause problems when transitioning to full-atom models in a hierarchical framework. Here, a CG potential energy function was validated by applying it to the problem of loop prediction. A novel method for sampling the conformational space of backbone atoms was benchmarked on a standard test set of 351 distinct loops. The method uses a sequence-independent CG potential energy function that represents the protein by α-carbon positions only, and samples conformations with a Monte Carlo simulated annealing protocol. Backbone atoms were then added using a previously described method and gradient-minimised in the Rosetta force field. Despite the CG potential energy function being sequence-independent, the method performed similarly to methods that explicitly use either fragments of known protein backbones with similar sequences or residue-specific φ/ψ-maps to restrict the search space. The method also predicted, with sub-Angstrom accuracy, two out of seven loops from recently solved crystal structures of proteins with low sequence and structure similarity to structures previously deposited in the PDB. Sampling realistic loop conformations directly from a potential energy function enables the incorporation of additional geometric restraints and the use of more advanced sampling methods, neither of which is easy with fragment replacement methods; it also enables multi-scale simulations for protein design and protein structure prediction. The restraints could be derived from experimental data or, in computational protein design, could be design restraints.
C++ source code is available for download from http://www.sbg.bio.ic.ac.uk/phyre2/PD2/.
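
The sampling idea summarised above (Monte Carlo simulated annealing over α-carbon positions, with optional geometric restraints added to the energy) can be illustrated with a minimal Python sketch. This is not the published potential or the distributed C++ code: the energy terms (`toy_energy`, with a harmonic virtual-bond term, a soft clash term and harmonic distance restraints), the random-displacement move set, the geometric cooling schedule and all parameter values are assumptions chosen only to make the example self-contained.

```python
import math
import random

BOND = 3.8  # typical CA-CA virtual bond length in angstroms


def dist(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def toy_energy(chain, restraints=(), k_bond=10.0, k_rest=1.0, clash=3.0):
    """Hypothetical sequence-independent CA-only energy (illustrative only)."""
    e = 0.0
    # Harmonic term keeping consecutive CA atoms near the virtual bond length.
    for i in range(len(chain) - 1):
        e += k_bond * (dist(chain[i], chain[i + 1]) - BOND) ** 2
    # Soft repulsion between non-adjacent CA atoms closer than `clash`.
    for i in range(len(chain)):
        for j in range(i + 2, len(chain)):
            d = dist(chain[i], chain[j])
            if d < clash:
                e += (clash - d) ** 2
    # Optional geometric restraints: (i, j, target distance) triples.
    for i, j, target in restraints:
        e += k_rest * (dist(chain[i], chain[j]) - target) ** 2
    return e


def anneal_loop(chain, first, last, restraints=(), steps=2000,
                t_start=5.0, t_end=0.05, max_shift=0.5, seed=0):
    """Metropolis simulated annealing over residues first..last (inclusive).

    Residues outside that range (the loop stems) are never moved.
    """
    rng = random.Random(seed)
    current = [tuple(p) for p in chain]
    e_cur = toy_energy(current, restraints)
    best, e_best = list(current), e_cur
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / (steps - 1))
        i = rng.randint(first, last)
        trial = list(current)
        trial[i] = tuple(c + rng.uniform(-max_shift, max_shift)
                         for c in current[i])
        e_new = toy_energy(trial, restraints)
        # Metropolis criterion: always accept downhill, sometimes uphill.
        if e_new <= e_cur or rng.random() < math.exp(-(e_new - e_cur) / t):
            current, e_cur = trial, e_new
            if e_cur < e_best:
                best, e_best = list(current), e_cur
    return best, e_best


# Usage: six CA atoms squeezed onto a line with 2.0 A spacing; residues 0 and 5
# act as fixed stems and residues 1..4 form the loop to be sampled.
start = [(2.0 * i, 0.0, 0.0) for i in range(6)]
e_start = toy_energy(start)
best, e_best = anneal_loop(start, first=1, last=4)
print(f"energy before: {e_start:.1f}, after annealing: {e_best:.1f}")
```

Because restraints enter simply as extra terms in the energy, a call such as `anneal_loop(start, 1, 4, restraints=[(1, 4, 6.0)])` would bias the sampled loop towards a hypothetical 6 Å separation between residues 1 and 4 without changing the sampler itself.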

[31]  William R Taylor,et al.  De novo backbone scaffolds for protein design , 2009, Proteins.

[32]  Roland L. Dunbrack,et al.  proteins STRUCTURE O FUNCTION O BIOINFORMATICS Improved prediction of protein side-chain conformations with SCWRL4 , 2022 .

[33]  David E. Kim,et al.  Sampling bottlenecks in de novo protein structure prediction. , 2009, Journal of molecular biology.

[34]  I. Coluzza A Coarse-Grained Approach to Protein Design: Learning from Design to Understand Folding , 2011, PloS one.

[35]  B. Honig,et al.  A hierarchical approach to all‐atom protein loop prediction , 2004, Proteins.

[36]  A. Kolinski Protein modeling and structure prediction with a reduced representation. , 2004, Acta biochimica Polonica.

[37]  Shoji Takada,et al.  A Reversible Fragment Assembly Method for De Novo Protein Structure Prediction , 2003 .

[38]  A. Sali,et al.  Modeling of loops in protein structures , 2000, Protein science : a publication of the Protein Society.

[39]  U. Singh,et al.  A NEW FORCE FIELD FOR MOLECULAR MECHANICAL SIMULATION OF NUCLEIC ACIDS AND PROTEINS , 1984 .

[40]  Sergey Lyskov,et al.  PyRosetta: a script-based interface for implementing molecular modeling algorithms using Rosetta , 2010, Bioinform..

[41]  N. Go,et al.  Studies on protein folding, unfolding and fluctuations by computer simulation. III. Effect of short-range interactions. , 2009, International journal of peptide and protein research.

[42]  T. Head-Gordon,et al.  Minimalist models for protein folding and design. , 2003, Current opinion in structural biology.

[43]  M. DePristo,et al.  Ab initio construction of polypeptide fragments: Efficient generation of accurate, representative ensembles , 2003, Proteins.

[44]  Roland L. Dunbrack Rotamer libraries in the 21st century. , 2002, Current opinion in structural biology.

[45]  C Kooperberg,et al.  Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. , 1997, Journal of molecular biology.

[46]  Dominik Gront,et al.  Backbone building from quadrilaterals: A fast and accurate algorithm for protein backbone reconstruction from alpha carbon coordinates , 2007, J. Comput. Chem..

[47]  Yoonjoo Choi,et al.  FREAD revisited: Accurate loop structure prediction using a database search algorithm , 2010, Proteins.

[48]  D. Yee,et al.  Principles of protein folding — A perspective from simple exact models , 1995, Protein science : a publication of the Protein Society.

[49]  M. DePristo,et al.  Ab initio construction of polypeptide fragments: Accuracy of loop decoy discrimination by an all‐atom statistical potential and the AMBER force field with the Generalized Born solvation model , 2003, Proteins.

[50]  J. Skolnick,et al.  MONSSTER: a method for folding globular proteins with a small number of distance restraints. , 1997, Journal of molecular biology.

[51]  C. Sander,et al.  Database algorithm for generating protein backbone and side-chain co-ordinates from a C alpha trace application to model building and detection of co-ordinate errors. , 1991, Journal of molecular biology.

[52]  Cinque S. Soto,et al.  Evaluating conformational free energies: The colony energy and its application to the problem of loop prediction , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[53]  Kai Zhu,et al.  Toward better refinement of comparative models: Predicting loops in inexact environments , 2008, Proteins.

[54]  Brian Kuhlman,et al.  High-resolution design of a protein loop , 2007, Proceedings of the National Academy of Sciences.

[55]  R. Friesner,et al.  Long loop prediction using the protein local optimization program , 2006, Proteins.

[56]  Kam Y. J. Zhang,et al.  Accurate computer-based design of a new backbone conformation in the second turn of protein L. , 2002, Journal of molecular biology.

[57]  Shayantani Mukherjee,et al.  PRIMO/PRIMONA: A coarse‐grained model for proteins and nucleic acids that preserves near‐atomistic accuracy , 2010, Proteins.

[58]  Valentina Tozzini,et al.  Coarse-grained models for proteins. , 2005, Current opinion in structural biology.

[59]  J. Skolnick,et al.  TOUCHSTONE II: a new approach to ab initio protein structure prediction. , 2003, Biophysical journal.

[60]  Barry Honig,et al.  Loop modeling: Sampling, filtering, and scoring , 2007, Proteins.

[61]  Cecilia Clementi,et al.  Coarse-grained models of protein folding: toy models or predictive tools? , 2008, Current opinion in structural biology.

[62]  D. Baker,et al.  Modeling structurally variable regions in homologous proteins with rosetta , 2004, Proteins.

[63]  Alessandro Pandini,et al.  Structural alphabets derived from attractors in conformational space , 2010, BMC Bioinformatics.

[64]  D. Tieleman,et al.  The MARTINI force field: coarse grained model for biomolecular simulations. , 2007, The journal of physical chemistry. B.

[65]  Osvaldo Olmea,et al.  MAMMOTH (Matching molecular models obtained from theory): An automated method for model comparison , 2002, Protein science : a publication of the Protein Society.