RosettaHoles: Rapid assessment of protein core packing for structure prediction, refinement, design, and validation

We present a novel method called RosettaHoles for visual and quantitative assessment of underpacking in the protein core. RosettaHoles generates a set of spherical cavity balls that fill the empty volume between atoms in the protein interior. For visualization, the cavity balls are aggregated into contiguous overlapping clusters and small cavities are discarded, leaving an uncluttered representation of the unfilled regions of space in a structure. For quantitative analysis, the cavity ball data are used to estimate the probability of observing a given cavity in a high‐resolution crystal structure. RosettaHoles provides excellent discrimination between real and computationally generated structures, is predictive of incorrect regions in models, identifies problematic structures in the Protein Data Bank, and promises to be a useful validation tool for newly solved experimental structures.
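The visualization step described above, in which overlapping cavity balls are grouped into clusters and small cavities are dropped, can be illustrated with a minimal sketch. This is not the RosettaHoles implementation: the function name `cluster_cavity_balls`, the `(x, y, z, r)` ball representation, the `min_volume` threshold, and the use of summed sphere volumes as a cavity size estimate are all assumptions made for illustration.

```python
import math
from itertools import combinations


def cluster_cavity_balls(balls, min_volume=20.0):
    """Group overlapping spheres and drop small groups.

    balls: list of (x, y, z, r) tuples (hypothetical representation).
    min_volume: illustrative cutoff in cubic angstroms, not a published value.
    Returns a list of clusters, each a list of indices into `balls`.
    """
    parent = list(range(len(balls)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Two balls are joined when their centers are closer than the sum of radii.
    for i, j in combinations(range(len(balls)), 2):
        xi, yi, zi, ri = balls[i]
        xj, yj, zj, rj = balls[j]
        if math.dist((xi, yi, zi), (xj, yj, zj)) < ri + rj:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(balls)):
        clusters.setdefault(find(i), []).append(i)

    def volume(members):
        # Crude size estimate: sum of sphere volumes (overcounts overlap).
        return sum(4.0 / 3.0 * math.pi * balls[i][3] ** 3 for i in members)

    return [m for m in clusters.values() if volume(m) >= min_volume]
```

The pairwise overlap test is quadratic in the number of balls, which keeps the sketch short; a spatial grid or k-d tree would be the usual choice for a full-size protein.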
