Ten years of probabilistic estimates of biocrystal solvent content: new insights via nonparametric kernel density estimate.

The probabilistic estimate of the solvent content (Matthews probability) was first introduced in 2003. Given that the Matthews probability is based on prior information, revisiting the empirical foundation of this widely used solvent-content estimate is appropriate. The parameter set for the original Matthews probability distribution function employed in MATTPROB has been updated after ten years of rapid PDB growth. A new nonparametric kernel density estimator has been implemented to calculate the Matthews probabilities directly from empirical solvent-content data, thus avoiding the need to revise the multiple parameters of the original binned empirical fit function. The influence and dependency of other possible parameters determining the solvent content of protein crystals have been examined. Detailed analysis showed that resolution is the primary and dominating model parameter correlated with solvent content. Modifications of protein specific density for low molecular weight have no practical effect, and there is no correlation with oligomerization state. A weak, and in practice irrelevant, dependency on symmetry and molecular weight is present, but cannot be satisfactorily explained by simple linear or categorical models. The Bayesian argument that the observed resolution represents only a lower limit for the true diffraction potential of the crystal is maintained. The new kernel density estimator is implemented as the primary option in the MATTPROB web application at http://www.ruppweb.org/mattprob/.
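The idea of reading Matthews probabilities directly off an empirical distribution can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the MATTPROB implementation. It uses a hand-written Gaussian kernel density estimate over Matthews coefficients V_M and normalizes the density over candidate copy numbers. It ignores the resolution dependence that the abstract identifies as the dominant parameter. The function names, the bandwidth, the unit-cell values, and the synthetic `toy_vm` sample are all hypothetical; a real estimate would use V_M values derived from PDB entries. The relation V_S = 1 - 1.23/V_M is Matthews' standard approximation for protein crystals.

```python
import math
import random


def matthews_coefficient(cell_volume, n_sym_ops, n_mol, mol_weight):
    """V_M in A^3/Da: unit-cell volume divided by the total molecular mass in the cell."""
    return cell_volume / (n_sym_ops * n_mol * mol_weight)


def solvent_fraction(vm, protein_const=1.23):
    """Matthews' approximation for proteins: V_S = 1 - 1.23 / V_M."""
    return 1.0 - protein_const / vm


def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a 1-D Gaussian kernel density estimate."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def copy_number_probabilities(density, cell_volume, n_sym_ops, mol_weight,
                              max_copies=10):
    """Normalize the estimated density of V_M over candidate copy numbers N."""
    weights = {}
    for n in range(1, max_copies + 1):
        vm = matthews_coefficient(cell_volume, n_sym_ops, n, mol_weight)
        weights[n] = density(vm)
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


# ASSUMPTION: synthetic stand-in for empirical V_M values; not real PDB data.
random.seed(0)
toy_vm = [random.gauss(2.4, 0.4) for _ in range(500)]

density = gaussian_kde(toy_vm, bandwidth=0.1)  # bandwidth chosen arbitrarily

# Hypothetical crystal: 400000 A^3 cell, 4 symmetry operators, 20 kDa protein.
probs = copy_number_probabilities(
    density, cell_volume=400000.0, n_sym_ops=4, mol_weight=20000.0
)
most_likely = max(probs, key=probs.get)
```

Here `probs` maps each candidate number of molecules per asymmetric unit to a normalized probability, and `most_likely` is the copy number whose V_M falls in the densest region of the toy sample. Conditioning on resolution, as the abstract describes, would require restricting or weighting the sample by the reported resolution of each entry, which this sketch leaves out.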
