Driven to near‐experimental accuracy by refinement via molecular dynamics simulations

Protein model refinement is an essential part of successful protein structure prediction. Refinement methods based on molecular dynamics simulations have consistently improved protein models. The extent of refinement increased for several years after the idea of ensemble averaging of sampled conformations emerged, but there was little progress in CASP12 because harmonic restraints kept conformational sampling from being sufficiently diverse. During CASP13, a new refinement method was tested that achieved significant improvements over CASP12. The new method was intended to address previous bottlenecks in the refinement problem through three changes: flat‐bottom harmonic restraints replaced harmonic restraints, sampling was performed iteratively, and a new scoring function and new selection criteria were used. The new protocol expanded conformational sampling at reduced computational cost. In addition to the overall improvements, some models were refined significantly, reaching near‐experimental accuracy.
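The difference between the two restraint types named in the abstract can be illustrated with a minimal sketch. The functional forms below are the generic textbook ones; the force constant `k`, the flat-bottom `width`, and the function names are illustrative assumptions and are not taken from the protocol described here.

```python
import numpy as np

def harmonic_restraint(d, k=1.0):
    """Plain harmonic restraint: the penalty grows from any nonzero displacement d."""
    d = np.asarray(d, dtype=float)
    return 0.5 * k * d**2

def flat_bottom_restraint(d, k=1.0, width=2.0):
    """Flat-bottom harmonic restraint: no penalty while d <= width,
    harmonic penalty only on the part of d that exceeds width.
    (k and width are hypothetical values chosen for illustration.)"""
    d = np.asarray(d, dtype=float)
    excess = np.maximum(d - width, 0.0)
    return 0.5 * k * excess**2

# Displacements of a restrained atom from its reference position (arbitrary units).
displacements = np.array([0.0, 1.0, 2.0, 4.0])
harmonic_energies = harmonic_restraint(displacements)
flat_bottom_energies = flat_bottom_restraint(displacements)

for d, eh, ef in zip(displacements, harmonic_energies, flat_bottom_energies):
    print(f"d={d:.1f}  harmonic={eh:.2f}  flat-bottom={ef:.2f}")
```

Within the flat region the restraint contributes no energy, so a structure can move away from the starting model without penalty. This is the sense in which a flat-bottom form can allow more diverse sampling than a plain harmonic one.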
