Can molecular dynamics simulations help in discriminating correct from erroneous protein 3D models?

Background
Recent approaches for predicting the three-dimensional (3D) structure of proteins, such as de novo or fold recognition methods, mostly rely on simplified energy potential functions and a reduced representation of the polypeptide chain. These simplifications facilitate the exploration of the protein conformational space but do not fully capture the subtle relationship between an amino acid sequence and its native structure. It has been proposed that physics-based energy functions, together with techniques for sampling the conformational space such as Monte Carlo or molecular dynamics (MD) simulations, are better suited to modelling proteins at a higher resolution than that of the models obtained with the simplified methods. In this study we monitor different protein structural properties along MD trajectories to discriminate correct from erroneous models. These models are based on the sequence-structure alignments provided by our fold recognition method, FROST. We define correct models as those built from alignments of sequences with structures similar to their native structures, and erroneous models as those built from alignments of sequences with structures unrelated to their native structures.

Results
For three test sequences whose native structures belong to the all-α, all-β and αβ classes, we built a set of models intended to cover the whole spectrum: from a perfect model, i.e., the native structure, to a very poor model, i.e., a random alignment of the test sequence with a structure belonging to another structural class, including several intermediate models based on fold recognition alignments. We submitted these models to 11 ns of MD simulations at three different temperatures. Along the corresponding trajectories we monitored the mean of the root-mean-square deviations (RMSd) with respect to the initial conformation, the RMSd fluctuations, the number of conformation clusters, the evolution of secondary structures and the surface area of residues. None of these criteria alone discriminates correct from erroneous models with 100% efficiency. The mean RMSd, RMSd fluctuations, secondary structure and clustering of conformations show some false positives, whereas the residue surface area criterion shows false negatives. However, when these criteria are considered in combination, it is straightforward to discriminate the two types of models.

Conclusion
The ability to discriminate correct from erroneous models allows us to improve the specificity and sensitivity of our fold recognition method for a number of ambiguous cases.
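
As an illustration of the first two criteria named in the Results (mean RMSd with respect to the initial conformation, and RMSd fluctuations), the following is a minimal sketch, not the authors' implementation. It assumes a trajectory is available as a NumPy array of shape (n_frames, n_atoms, 3), that each frame is superposed on the first one with the Kabsch algorithm before the RMSd is computed, and that "fluctuation" means the standard deviation of the RMSd over the trajectory. The abstract does not specify these details, and the function names `kabsch_rmsd` and `rmsd_summary` are hypothetical.

```python
import numpy as np


def kabsch_rmsd(ref, conf):
    """RMSd between two (n_atoms, 3) coordinate sets after optimal superposition."""
    # Remove translation by centring both sets on their centroids.
    p = conf - conf.mean(axis=0)
    q = ref - ref.mean(axis=0)
    # SVD of the covariance matrix gives the optimal rotation (Kabsch algorithm).
    u, _, vt = np.linalg.svd(p.T @ q)
    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    diff = (rot @ p.T).T - q
    return float(np.sqrt((diff ** 2).sum() / len(ref)))


def rmsd_summary(traj):
    """Mean and standard deviation of the RMSd to the first frame.

    traj: array of shape (n_frames, n_atoms, 3).
    The first frame is included, so its RMSd of 0 contributes to both values.
    Assumption: the standard deviation stands in for "RMSd fluctuations".
    """
    ref = traj[0]
    rmsd = np.array([kabsch_rmsd(ref, frame) for frame in traj])
    return float(rmsd.mean()), float(rmsd.std())
```

Under these assumptions, a model that stays close to its starting conformation would give a low mean RMSd and small fluctuations. As the Results note, these criteria alone produce some false positives, so such numbers would only be one input among several.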
