A novel method for predicting and using distance constraints of high accuracy for refining protein structure prediction

The principal bottleneck in protein structure prediction is refining models from lower accuracy to the resolution observed experimentally. We developed a novel constraint-based refinement method that identifies a large number of accurate input constraints from initial models and rebuilds the models using restrained torsion angle dynamics (rTAD). We previously created a Bayesian residue-specific all-atom probability discriminatory function (RAPDF) that discriminates native-like models by measuring the probability of accuracy for atom-type distances within a given model. Here, we exploit RAPDF to score (i.e., filter) constraints from initial predictions that may or may not be close to a native-like state, obtain a consensus of the top-scoring constraints among five initial models, and compile sets with no redundant residue-pair constraints. We find that this method consistently produces a large and highly accurate set of distance constraints from which to build refinement models. We further optimize the balance between constraint accuracy and coverage by producing multiple structure sets using different constraint distance cutoffs, and note that the cutoff governs spatially near versus distant effects in model generation. This complete procedure of deriving distance constraints for rTAD simulations significantly improves the quality of the initial predictions in all cases we evaluated. Our procedure represents a significant step toward solving the protein structure prediction and refinement problem by enabling the use of consensus constraints, RAPDF, and rTAD for protein structure modeling and refinement. Proteins 2009. © 2009 Wiley-Liss, Inc.
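The constraint-selection steps summarized above (apply a distance cutoff, score, take the consensus across initial models, keep one constraint per residue pair) can be illustrated with a minimal Python sketch. It is not the published implementation. The function name `consensus_constraints`, the parameters `keep` and `min_support`, the representation of a model as a dictionary of residue-pair distances, and the use of the minimum and maximum observed distances as bounds are all assumptions made for illustration, and `score_fn` is a hypothetical stand-in for RAPDF, which is not reproduced here.

```python
from collections import defaultdict


def consensus_constraints(models, score_fn, cutoff=8.0, keep=0.5, min_support=3):
    """Return {(i, j): (lower, upper)} distance bounds supported by several models.

    models      : list of dicts mapping a residue pair (i, j) to a distance in angstroms
    score_fn    : hypothetical stand-in for a RAPDF-like score; lower is treated as better
    cutoff      : only distances at or below this value are considered
    keep        : fraction of best-scoring pairs retained per model (assumed parameter)
    min_support : number of models that must retain a pair (assumed parameter)
    """
    support = defaultdict(list)
    for model in models:
        # Restrict to pairs within the distance cutoff.
        near = {pair: d for pair, d in model.items() if d <= cutoff}
        # Rank pairs by score and keep the best-scoring fraction (at least one pair).
        ranked = sorted(near, key=lambda pair: score_fn(pair, near[pair]))
        n_keep = max(1, int(len(ranked) * keep)) if ranked else 0
        for pair in ranked[:n_keep]:
            support[pair].append(near[pair])

    constraints = {}
    for pair, dists in support.items():
        if len(dists) >= min_support:
            # Keys are unique, so each residue pair yields exactly one constraint.
            constraints[pair] = (min(dists), max(dists))
    return constraints
```

Raising `cutoff` admits longer-range pairs and lowering it keeps only spatially near ones, which is one simple way to generate the multiple constraint sets at different cutoffs that the abstract describes.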
