Statistics in respiratory medicine. 2. Repeatability and method comparison.

Repeatability and reference ranges for change
A measurement that is totally unrepeatable clearly has no validity. Repeatability, or reproducibility, is however an ambiguous concept without a precise definition. To a clinical chemist it may mean reproducibility of results from an autoanalyser for a single blood sample. A technician may spend hours perfecting a non-automatic technique. But if two blood samples taken from the same subject within hours give very different results, laboratory repeatability may be relatively unimportant. The repeatability of most respiratory measurements is necessarily of the "time to time" type, with the interval between measurements measured at least in minutes but possibly in hours or days. Differences in the results obtained depend on the time gap, the variation increasing (that is, repeatability decreasing) with the length of the gap. Neild et al¹ measured forced expiratory volume in one second (FEV1), peak expiratory flow (PEF), respiratory resistance, and specific airway conductance in 25 non-asthmatic subjects three times on each of three consecutive days. They reported the "within subject within day" variance of each measurement, and also the estimate of the additional variance due to variation within subjects but between days. Variances were used in the paper by Neild et al¹ because components of variance may be added, whereas standard deviations may not. The usual measure of repeatability when only within day or only between day variation is studied is the within subject standard deviation. If more than two repeat measurements are carried out for some or all subjects, then repeatability is most easily calculated from the results of a one way analysis of variance, with "subjects" as the "group" variable. The within subject standard deviation is the square root of the pooled within subject sum of squares divided by its degrees of freedom (that is, in analysis of variance terminology, the square root of the residual "mean square").
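The calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the method of any cited paper: the function name `within_subject_sd` and the FEV1 values are hypothetical, chosen only to show the pooled within subject sum of squares being divided by its degrees of freedom.

```python
import math


def within_subject_sd(measurements):
    """Within subject standard deviation from repeat measurements.

    `measurements` holds one list of repeat values per subject; subjects
    may have different numbers of repeats. The result is the square root
    of the residual mean square from a one way analysis of variance with
    "subjects" as the group variable.
    """
    ss_within = 0.0  # pooled within subject sum of squares
    df_within = 0    # pooled within subject degrees of freedom
    for values in measurements:
        if len(values) < 2:
            continue  # a single value carries no within subject information
        mean = sum(values) / len(values)
        ss_within += sum((v - mean) ** 2 for v in values)
        df_within += len(values) - 1
    if df_within == 0:
        raise ValueError("at least one subject needs repeat measurements")
    return math.sqrt(ss_within / df_within)


# Hypothetical FEV1 values in litres, for illustration only.
fev1 = [
    [3.1, 3.3, 3.2],
    [2.4, 2.6],
    [4.0, 3.8, 3.9],
]
sw = within_subject_sd(fev1)
print(f"within subject SD = {sw:.3f} l")
```

Pooling sums of squares and degrees of freedom across subjects, rather than averaging each subject's standard deviation, is what lets subjects with different numbers of repeats contribute appropriately.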
Frequently repeatability studies are carried out with just two repeat measurements per subject; indeed, for most purposes this is the most efficient design. Then it is natural to take the differences between the first and second measurements for each subject. For example, Chinn et al,² from a study designed to assess repeatability of histamine challenge tests, reported the mean difference between the post-saline FEV1 values measured in 107 subjects on two occasions as 0.01 litres and the standard deviation of the differences as 0.42 l. To convert a standard deviation of differences, which has double the variance of a single FEV1, to a within subject standard deviation of a single FEV1 we divide 0.42 by √2 to get 0.30. Strictly speaking, the fact that the mean difference is not exactly zero but 0.01 should be taken into account. The within subject standard deviation is actually