Version 3.0 of the Biological Macromolecule Crystallization Database (BMCD) was analyzed statistically using clustering techniques, in an effort to identify trends that may be useful in crystallizing new macromolecules. Our previous statistical analysis was performed on version 1.0 of the BMCD [C.T. Samudzi, M.J. Fivash, J.M. Rosenberg, J. Crystal Growth 123 (1992) 47], which contained information on 1025 crystallization experiments for 820 biological macromolecules; about 35% of those entries were incomplete and thus unsuitable for analysis. Version 3.0 is more than 90% complete and contains information on about 2300 crystallization experiments for approximately 1500 biological macromolecules [G.L. Gilliland, M. Tung, D.M. Blakeslee, J.E. Ladner, Acta Cryst. D 50 (1994) 408]. With significantly more data now in the BMCD, the question is whether the trends have changed. SAS software [SAS Institute Inc., SAS/STAT, Version 6, 4th ed., vol. 1] was used throughout the analysis. Each experiment was defined by six crystallization parameters: pH, temperature, molecular weight, macromolecular concentration, precipitant type and crystallization method. From these parameters, a measure of the difference between experiments was developed, and groups (clusters) of similar experiments were identified as those lying close together under this measure. The database was resolved into 25 clusters; the pseudo-F statistic for 25 clusters was 306.30, which is statistically significant (p < 0.0001). Although eight of these clusters can be treated as outliers, the other 17 provide useful information for recognizing new patterns and developing strategies for the crystallization of macromolecules.
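The abstract does not state the form of the difference measure or the exact pseudo-F computation, so the following is only a minimal sketch under stated assumptions, written in Python rather than SAS. It assumes a Gower-style dissimilarity (range-scaled differences for the four numeric parameters, simple mismatch for precipitant type and crystallization method) and a distance-based pseudo-F, which equals the usual between/within variance ratio when the distances are Euclidean. The field names, the helper functions and the four toy experiments are hypothetical and are not taken from the BMCD.

```python
from itertools import combinations

# Hypothetical field names for the six parameters listed in the abstract.
NUMERIC = ("ph", "temp_c", "mol_weight_kda", "conc_mg_ml")
CATEGORICAL = ("precipitant", "method")

# Toy records with made-up values, for illustration only (not BMCD data).
EXPERIMENTS = [
    {"ph": 4.5, "temp_c": 4, "mol_weight_kda": 14, "conc_mg_ml": 10,
     "precipitant": "ammonium sulfate", "method": "vapor diffusion"},
    {"ph": 5.0, "temp_c": 4, "mol_weight_kda": 18, "conc_mg_ml": 12,
     "precipitant": "ammonium sulfate", "method": "vapor diffusion"},
    {"ph": 7.5, "temp_c": 22, "mol_weight_kda": 60, "conc_mg_ml": 5,
     "precipitant": "PEG 4000", "method": "batch"},
    {"ph": 8.0, "temp_c": 20, "mol_weight_kda": 66, "conc_mg_ml": 4,
     "precipitant": "PEG 4000", "method": "batch"},
]


def numeric_ranges(experiments):
    """Range of each numeric parameter; 1.0 is substituted for a zero range."""
    ranges = {}
    for key in NUMERIC:
        values = [e[key] for e in experiments]
        ranges[key] = (max(values) - min(values)) or 1.0
    return ranges


def dissimilarity(a, b, ranges):
    """Assumed Gower-style dissimilarity in [0, 1] between two experiments."""
    parts = [abs(a[k] - b[k]) / ranges[k] for k in NUMERIC]
    parts += [0.0 if a[k] == b[k] else 1.0 for k in CATEGORICAL]
    return sum(parts) / len(parts)


def pseudo_f(experiments, labels, ranges):
    """Distance-based pseudo-F: (between / (k - 1)) / (within / (n - k)).

    Requires more experiments than clusters and nonzero within-cluster spread.
    """
    n = len(experiments)
    sizes = {c: labels.count(c) for c in set(labels)}
    k = len(sizes)
    total = 0.0
    within = 0.0
    for i, j in combinations(range(n), 2):
        d2 = dissimilarity(experiments[i], experiments[j], ranges) ** 2
        total += d2 / n  # total sum of squares from pairwise distances
        if labels[i] == labels[j]:
            within += d2 / sizes[labels[i]]  # within-cluster contribution
    between = total - within
    return (between / (k - 1)) / (within / (n - k))


RANGES = numeric_ranges(EXPERIMENTS)
GOOD_F = pseudo_f(EXPERIMENTS, [0, 0, 1, 1], RANGES)  # similar runs grouped
BAD_F = pseudo_f(EXPERIMENTS, [0, 1, 0, 1], RANGES)   # dissimilar runs grouped
```

In this sketch, a labeling that groups similar experiments yields a much larger pseudo-F than one that mixes them, which is the sense in which a large pseudo-F indicates well-separated clusters.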