Efficiency analysis of sampling protocols used in protein crystallization screening

Abstract: To compare the efficiency of protein crystallization screening techniques objectively, a probability model of sampling efficiency is developed and used to calculate sampling efficiencies from experimental data. Three typical sampling protocols (grid screening, footprint screening, and random screening) are used to crystallize each of five proteins (Phospholipase A2, Thaumatin, Catalase, Lysozyme, and Ribonuclease B). For each of the three sampling protocols, experiments are chosen from a large set of possible experiments generated by systematic combination of a number of parameters common in crystallization screens. Software has been developed to generate the combinations and to select from them with each of the three sampling protocols examined in this study. The protocols differ only in the order in which samples are chosen from the set of possible combinations. Random sampling is motivated by the “Incomplete Factorial” screen (Carter and Carter, J. Biol. Chem. 254 (1979) 12219); sampling with subsets of four is motivated by the “Footprint” screen (Stura et al., J. Crystal Growth 122 (1992) 273); and sampling with subsets of twenty-four is motivated by the “Grid” screen (McPherson, Preparation and Analysis of Protein Crystals, Wiley, New York, 1982). For the five proteins examined, random sampling has the greatest average efficiency. Additional benefits of random sampling are discussed.
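As an illustration of how three protocols can differ only in the order of selection, the following minimal Python sketch builds a full set of parameter combinations and reorders it by subsets of size 1, 4, and 24. It is not the software described above. The parameter names and levels, the function names, and the interpretation of a "subset" as a contiguous block of the systematically generated list are all assumptions made for illustration; the abstract does not specify how subsets are formed.

```python
import itertools
import random


def all_combinations(parameters):
    """Build the full set of experiments: one per combination of parameter levels."""
    names = list(parameters)
    level_lists = [parameters[name] for name in names]
    return [dict(zip(names, levels)) for levels in itertools.product(*level_lists)]


def sampling_order(experiments, subset_size, seed=0):
    """Reorder experiments by drawing contiguous subsets in random order.

    Assumption: a subset is a contiguous block of the systematically
    generated list. With subset_size=1 this reduces to plain random sampling.
    """
    rng = random.Random(seed)
    blocks = [
        experiments[i:i + subset_size]
        for i in range(0, len(experiments), subset_size)
    ]
    rng.shuffle(blocks)  # only the order of the blocks is randomized
    return [experiment for block in blocks for experiment in block]


# Hypothetical parameter levels, chosen only so that the example runs.
parameters = {
    "precipitant": ["PEG 4000", "ammonium sulfate", "MPD", "NaCl"],
    "pH": [4.5, 5.5, 6.5, 7.5, 8.5, 9.5],
    "additive": ["none", "CaCl2", "MgCl2", "glycerol"],
}

experiments = all_combinations(parameters)  # 4 * 6 * 4 = 96 combinations

random_order = sampling_order(experiments, subset_size=1)     # random screening
footprint_like = sampling_order(experiments, subset_size=4)   # subsets of four
grid_like = sampling_order(experiments, subset_size=24)       # subsets of twenty-four
```

All three orderings contain the same 96 experiments; only the sequence in which they would be set up differs, which is the property the comparison of protocols relies on.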