Assessment of the utility of contact‐based restraints in accelerating the prediction of protein structure using molecular dynamics simulations

Molecular dynamics (MD) simulation is a well‐established tool for the computational study of protein structure and dynamics, but its application to the important problem of protein structure prediction remains challenging, in part because extremely long timescales can be required to reach the native structure. Here, we examine the extent to which the use of low‐resolution information in the form of residue–residue contacts, which can often be inferred from bioinformatics or experimental studies, can accelerate the determination of protein structure in simulation. We incorporated sets of 62, 31, or 15 contact‐based restraints in MD simulations of ubiquitin, a benchmark system known to fold to the native state on the millisecond timescale in unrestrained simulations. One‐third of the restrained simulations folded to the native state within a few tens of microseconds—a speedup of over an order of magnitude compared with unrestrained simulations and a demonstration of the potential for limited amounts of structural information to accelerate structure determination. Almost all of the remaining ubiquitin simulations reached near‐native conformations within a few tens of microseconds, but remained trapped there, apparently due to the restraints. We discuss potential methodological improvements that would facilitate escape from these near‐native traps and allow more simulations to quickly reach the native state. Finally, using a target from the Critical Assessment of protein Structure Prediction (CASP) experiment, we show that distance restraints can improve simulation accuracy: In our simulations, restraints stabilized the native state of the protein, enabling a reasonable structural model to be inferred.
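To make the idea of a contact-based restraint concrete, the following is a minimal sketch of one common way such restraints can be expressed: a flat-bottomed harmonic penalty that is zero while two residues are within a contact cutoff and grows quadratically beyond it. The abstract does not state the functional form, cutoff, or force constant used in the study, so the function name `contact_restraint_energy`, the 8 Å cutoff, and the force constant `k` are illustrative assumptions, not the authors' protocol.

```python
import math

def contact_restraint_energy(coords, contacts, cutoff=8.0, k=1.0):
    """Total flat-bottomed harmonic restraint energy over residue pairs.

    coords   : dict mapping residue index -> (x, y, z) position in angstroms
               (for example, C-alpha coordinates); hypothetical input format.
    contacts : iterable of (i, j) residue-index pairs to restrain.
    cutoff   : distance (angstroms) below which a pair incurs no penalty
               (assumed value, not taken from the source).
    k        : force constant in energy / angstrom^2 (assumed value).
    """
    total = 0.0
    for i, j in contacts:
        d = math.dist(coords[i], coords[j])   # Euclidean distance
        excess = max(0.0, d - cutoff)         # zero inside the flat bottom
        total += 0.5 * k * excess ** 2        # harmonic penalty beyond cutoff
    return total


# Toy example: one satisfied contact and one violated by 2 angstroms.
coords = {1: (0.0, 0.0, 0.0), 2: (10.0, 0.0, 0.0), 3: (0.0, 5.0, 0.0)}
contacts = [(1, 2), (1, 3)]
energy = contact_restraint_energy(coords, contacts)
```

In this form a restraint only pulls a pair together when it is farther apart than the cutoff, so a correct contact that has already formed adds no bias. A smaller restraint set, such as the 31 or 15 contacts mentioned above, would simply correspond to a shorter `contacts` list.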
