High-resolution structure prediction and the crystallographic phase problem

The energy-based refinement of low-resolution protein structure models to atomic-level accuracy is a major challenge for computational structural biology. Here we describe a new approach to refining protein structure models that focuses sampling in regions most likely to contain errors while allowing the whole structure to relax in a physically realistic all-atom force field. In applications to models produced using nuclear magnetic resonance data and to comparative models based on distant structural homologues, the method can significantly improve the accuracy of the structures in terms of both the backbone conformations and the placement of core side chains. Furthermore, the resulting models satisfy a particularly stringent test: they provide significantly better solutions to the X-ray crystallographic phase problem in molecular replacement trials. Finally, we show that all-atom refinement can produce de novo protein structure predictions that reach the high accuracy required for molecular replacement without any experimental phase information and in the absence of templates suitable for molecular replacement from the Protein Data Bank. These results suggest that the combination of high-resolution structure prediction with state-of-the-art phasing tools may be unexpectedly powerful in phasing crystallographic data for which molecular replacement is hindered by the absence of sufficiently accurate previous models.
