PDBStat: a universal restraint converter and restraint analysis software package for protein NMR

The heterogeneous array of software tools used in protein NMR structure determination presents organizational challenges for both structure determination and validation, and creates a learning curve that limits the broader use of protein NMR in biology. These challenges, including the accurate handling of the different data formats required by programs that carry out similar tasks, continue to confound novices and experts alike, and they need to be addressed robustly in order to standardize protein NMR structure determination and validation. PDBStat is a C/C++ computer program originally developed as a universal coordinate and protein NMR restraint converter. Its primary function is to provide a user-friendly tool for interconverting protein coordinate and protein NMR restraint data formats. It also provides an integrated set of computational methods for protein NMR restraint analysis and structure quality assessment, relabeling of prochiral atoms with correct IUPAC names, and multiple methods for analyzing the consistency of atomic positions, as indicated by their convergence across a protein NMR ensemble. In this paper we provide a detailed description of the PDBStat software and highlight some of its valuable computational capabilities. As an example, we demonstrate the use of the PDBStat restraint converter for restrained CS-Rosetta structure generation calculations, and compare the resulting protein NMR structure models with those generated from the same NMR restraint data using more traditional structure determination methods. These results demonstrate the value of a universal restraint converter: it allows multiple structure generation methods to be applied to the same restraint data, enabling consensus analysis of protein NMR structures and of the underlying restraint data.
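To make the idea of restraint interconversion concrete, the following is a minimal Python sketch of rewriting one distance restraint from a CYANA-style seven-column upper-limit line into an Xplor/CNS-style `assign` statement. It is an illustration only and is not PDBStat's implementation or interface: the names `DistanceRestraint`, `parse_upl_line`, and `to_assign_statement` are hypothetical, the default lower bound of 1.8 Å is an assumed value, and the sketch deliberately omits the atom-name translation, pseudoatom handling, and prochiral relabeling that a real converter has to perform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DistanceRestraint:
    """One upper-limit distance restraint between two atoms (hypothetical type)."""
    res1: int
    atom1: str
    res2: int
    atom2: str
    upper: float  # upper distance limit in angstroms


def parse_upl_line(line: str) -> DistanceRestraint:
    """Parse a seven-column line: res1 resname1 atom1 res2 resname2 atom2 upper."""
    fields = line.split()
    if len(fields) < 7:
        raise ValueError(f"expected 7 columns, got {len(fields)}: {line!r}")
    # Residue names (columns 2 and 5) are not needed for the output statement.
    return DistanceRestraint(
        int(fields[0]), fields[2], int(fields[3]), fields[5], float(fields[6])
    )


def to_assign_statement(r: DistanceRestraint, lower: float = 1.8) -> str:
    """Write 'assign (sel1) (sel2) d dminus dplus' with d set to the upper limit.

    The allowed range is [d - dminus, d + dplus], so dplus is 0 and
    dminus spans down to the assumed lower bound (never negative).
    """
    dminus = max(0.0, r.upper - lower)
    return (
        f"assign (resid {r.res1} and name {r.atom1}) "
        f"(resid {r.res2} and name {r.atom2}) "
        f"{r.upper:.2f} {dminus:.2f} 0.00"
    )


if __name__ == "__main__":
    restraint = parse_upl_line("10 ALA HA 14 LEU HB2 4.50")
    print(to_assign_statement(restraint))
```

Parsing into a neutral in-memory record before writing any output format keeps the readers and writers independent of each other, so each additional format needs one parser and one writer instead of a dedicated converter for every pair of formats.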
