Improved Prediction Error Estimates for Multivariate Calibration by Correcting for the Measurement Error in the Reference Values

The validation of multivariate calibration models using measured reference values leads to a so-called apparent prediction error estimate, which is systematically larger than the true prediction error. The reason for this difference is clear: the measured reference values contain an irrelevant random component, the measurement error, which cannot be predicted by any model, not even the “true” one. Nevertheless, the contribution of the measurement error in the reference values to the apparent prediction error estimate is interpreted as an inadequacy of the calibration model rather than an inadequacy of the reference values themselves. This confounding has been pointed out recently by several researchers, but no generally applicable solution has been given. In this paper, we propose a simple correction procedure that yields a more realistic estimate of the true prediction error. A large potential improvement over the conventional estimate is demonstrated for a variety of applications taken from the literature.
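
The inflation of the apparent prediction error can be made explicit with a short variance decomposition. This is only a sketch under assumptions that the text above does not state: the symbols $y$ (true value), $y_{\mathrm{ref}}$ (measured reference value), $\hat{y}$ (model prediction), $\delta$ (reference measurement error), and $\sigma_\delta^2$ (its variance) are introduced here for illustration, and $\delta$ is assumed to have zero mean and to be independent of the model's true prediction error $\hat{y} - y$.

```latex
% Assumed error model for the reference values (illustrative notation):
%   y_ref = y + delta,  E[delta] = 0,  Var(delta) = sigma_delta^2
\begin{align}
  y_{\mathrm{ref}} &= y + \delta, \\
  \mathrm{E}\!\left[(\hat{y} - y_{\mathrm{ref}})^2\right]
    &= \mathrm{E}\!\left[(\hat{y} - y)^2\right]
       - 2\,\mathrm{E}\!\left[(\hat{y} - y)\,\delta\right]
       + \mathrm{E}\!\left[\delta^2\right] \\
    &= \mathrm{E}\!\left[(\hat{y} - y)^2\right] + \sigma_\delta^2 .
\end{align}
% The cross term vanishes because delta is assumed zero-mean and
% independent of the true prediction error (\hat{y} - y).
```

Under these assumptions, the apparent mean squared prediction error exceeds the true one by $\sigma_\delta^2$, whatever the quality of the model. An independent estimate of the reference error variance would therefore indicate how much of the apparent error is not attributable to the calibration model.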