Free R value: a novel statistical quantity for assessing the accuracy of crystal structures

The determination of macromolecular structure by crystallography involves fitting atomic models to the observed diffraction data [1]. The traditional measure of the quality of this fit, and presumably the accuracy of the model, is the R value. Despite stereochemical restraints [2], it is possible to overfit or 'misfit' the diffraction data: an incorrect model can be refined to fairly good R values, as several recent examples have shown [3]. Here I propose a reliable and unbiased indicator of the accuracy of such models. By analogy with the cross-validation method [4,5] of testing statistical models, I define a statistical quantity (R_T^free) that measures the agreement between observed and computed structure factor amplitudes for a 'test' set of reflections that is omitted in the modelling and refinement process. As examples show, there is a high correlation between R_T^free and the accuracy of the atomic model phases. This is useful because experimental phase information is usually inaccurate, incomplete or unavailable. I expect that R_T^free will provide a measure of the information content of recently proposed models of thermal motion and disorder [6–8], time-averaging [9] and bulk solvent [10].
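
The free R value described above can be illustrated with a minimal sketch. It assumes the conventional crystallographic R value, sum(| |Fobs| - k|Fcalc| |) / sum(|Fobs|), with the scale factor k fixed at 1; the abstract itself does not state the formula. The function names, the 10% test fraction, the random partition and the amplitudes are all hypothetical choices made for illustration, not values taken from the text.

```python
import random


def r_value(f_obs, f_calc, indices):
    """Conventional R value over the reflections listed in `indices`.

    Assumes amplitudes are already on a common scale (scale factor k = 1).
    """
    numerator = sum(abs(f_obs[i] - f_calc[i]) for i in indices)
    denominator = sum(f_obs[i] for i in indices)
    return numerator / denominator


def split_reflections(n_reflections, test_fraction=0.1, seed=0):
    """Randomly partition reflection indices into a working and a test set.

    The 10% default is an illustrative assumption, not a value from the text.
    Returns (working_indices, test_indices); the two sets are disjoint.
    """
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_test = max(1, round(n_reflections * test_fraction))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


# Hypothetical observed and computed structure factor amplitudes.
f_obs = [10.0, 20.0, 30.0, 40.0, 15.0, 25.0, 35.0, 45.0, 12.0, 22.0]
f_calc = [11.0, 18.0, 30.0, 44.0, 14.0, 27.0, 33.0, 46.0, 12.5, 20.0]

working, test = split_reflections(len(f_obs))

# A refinement would use only the working set; the test set is held out.
r_working = r_value(f_obs, f_calc, working)
r_free = r_value(f_obs, f_calc, test)
print(f"R (working set) = {r_working:.3f}, free R (test set) = {r_free:.3f}")
```

The point of the design is that the test reflections never enter the modelling or refinement, so the free R value cannot be lowered simply by fitting the model more closely to the working data.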