A Bayesian approach to the evaluation of comparisons of individually value-assigned reference materials

Several recent international comparison studies used a relatively novel experimental design to evaluate the measurement capabilities of participating organizations. These studies compared the values assigned by each participant to one or more qualitatively similar materials with measurements made on all of the materials by one laboratory under repeatability conditions. A statistical model was then established relating the assigned values to the repeatability measurements; the extent of agreement between the assigned value(s) and the consensus model reflected the participants’ measurement capabilities. Since each participant used their own supplies, equipment, and methods to produce and value-assign their material(s), the agreement between the assigned value(s) and the model was a fairer reflection of their intrinsic capabilities than that provided by studies that directly compared time- and material-constrained measurements on unknown samples prepared elsewhere. A new statistical procedure is presented for the analysis of such data. The procedure incorporates several novel concepts, most importantly a leave-one-out strategy for the estimation of the consensus value of the measurand, model fitting via Bayesian posterior probabilities, and the calculation of posterior coverage probabilities for the assigned 95% uncertainty intervals. The benefits of the new procedure are illustrated using data from the CCQM-K54 comparison of eight cylinders of n-hexane in methane.
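
The leave-one-out idea can be illustrated with a deliberately simplified sketch. It is not the procedure presented in this work: it assumes a straight-line relation between assigned values and repeatability measurements, replaces the Bayesian model fit with ordinary least squares and a normal approximation, and takes the assigned 95% interval to be the assigned value plus or minus twice its standard uncertainty. The function names and all numbers are hypothetical and are not CCQM-K54 data.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit y = a + b*x; returns a, b, residual variance, mean(x), Sxx."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    s2 = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    return a, b, s2, mx, sxx

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def leave_one_out_coverage(assigned, u_assigned, measured):
    """For each material, fit the line without it, infer its consensus value
    from its repeatability measurement, and return (consensus value,
    approximate standard deviation, probability that the consensus value
    lies inside the assigned interval assigned +/- 2*u)."""
    out = []
    for i in range(len(assigned)):
        xs = assigned[:i] + assigned[i + 1:]
        ys = measured[:i] + measured[i + 1:]
        a, b, s2, mx, sxx = fit_line(xs, ys)
        x_hat = (measured[i] - a) / b  # inverse prediction from the fitted line
        # Usual large-sample approximation for the inverse-prediction spread.
        sd = math.sqrt(s2) / abs(b) * math.sqrt(
            1.0 + 1.0 / len(xs) + (x_hat - mx) ** 2 / sxx)
        lo = assigned[i] - 2.0 * u_assigned[i]  # assumed 95% interval (k = 2)
        hi = assigned[i] + 2.0 * u_assigned[i]
        p = normal_cdf((hi - x_hat) / sd) - normal_cdf((lo - x_hat) / sd)
        out.append((x_hat, sd, p))
    return out

# Hypothetical data for eight materials; the last assigned value is deliberately discrepant.
assigned = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
u_assigned = [0.5] * 8
measured = [20.1, 39.8, 60.2, 80.1, 99.9, 120.2, 139.9, 166.0]

results = leave_one_out_coverage(assigned, u_assigned, measured)
for k, (x_hat, sd, p) in enumerate(results, start=1):
    print(f"material {k}: consensus {x_hat:.2f}, sd {sd:.2f}, coverage {p:.3f}")
```

Omitting each material from the fit used to judge it keeps a discrepant assigned value from pulling the consensus toward itself. In this sketch the discrepant material therefore receives a coverage probability close to zero.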