MolProbity's ultimate rotamer‐library distributions for model validation

Here we describe the updated MolProbity rotamer‐library distributions derived from an order‐of‐magnitude larger and more stringently quality‐filtered dataset of about 8000 (vs. 500) protein chains, and we explain the resulting changes and improvements to model validation as seen by users. To include only side‐chains with satisfactory justification for their given conformation, we added residue‐specific filters for electron‐density value and model‐to‐density fit. The combined new protocol retains a million residues of data, while cleaning up false‐positive noise in the multi‐χ datapoint distributions. It enables unambiguous characterization of conformational clusters nearly 1000‐fold less frequent than the most common ones. We describe examples of local interactions that favor these rare conformations, including the role of authentic covalent bond‐angle deviations in enabling presumably strained side‐chain conformations. Further, in addition to the favored and outlier categories, an allowed category (0.3–2.0% occurrence in reference data) has been added, analogous to the Ramachandran validation categories. The new rotamer distributions are used for current rotamer validation in MolProbity and PHENIX, and for rotamer choice in PHENIX model‐building and refinement. The multi‐dimensional χ distributions and the Top8000 reference dataset are freely available on GitHub. These rotamers are termed “ultimate” because data sampling and quality are now fully adequate for this task, and also because we believe the future of conformational validation should integrate side‐chain with backbone criteria. Proteins 2016; 84:1177–1189. © 2016 Wiley Periodicals, Inc.
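The three validation categories can be illustrated with a minimal sketch. It assumes each side‐chain has already been assigned a score equal to its percent occurrence in the reference data. The function name `classify_rotamer`, the constant names, and the treatment of scores falling exactly on 0.3% or 2.0% are assumptions made for illustration; this is not the MolProbity or PHENIX implementation.

```python
# Hypothetical thresholds taken from the 0.3-2.0% "allowed" range in the abstract.
OUTLIER_BELOW = 0.3   # percent occurrence; below this is treated as an outlier
FAVORED_FROM = 2.0    # percent occurrence; at or above this is treated as favored


def classify_rotamer(percent_occurrence: float) -> str:
    """Map a rotamer score (percent occurrence in reference data) to a category.

    Boundary handling is an assumption: exactly 0.3 counts as allowed and
    exactly 2.0 counts as favored.
    """
    if not 0.0 <= percent_occurrence <= 100.0:
        raise ValueError("percent_occurrence must lie between 0 and 100")
    if percent_occurrence < OUTLIER_BELOW:
        return "outlier"
    if percent_occurrence < FAVORED_FROM:
        return "allowed"
    return "favored"


if __name__ == "__main__":
    for score in (0.1, 1.0, 35.0):
        print(score, classify_rotamer(score))
```

With these assumed thresholds, a score of 0.1 is reported as an outlier, 1.0 as allowed, and 35.0 as favored.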
