The Data Quality Concept of Accuracy in the Context of Public Use Data Sets
Discussion Papers

Like other data quality dimensions, the concept of accuracy is often used to characterise a particular data set. However, its common specification essentially refers to statistical properties of estimators, which can hardly be verified by means of the single survey at hand. This ambiguity can be resolved by assigning ‘accuracy’ to the survey processes that are known to affect these properties. In this contribution, we consider the sub-process of imputation as one important step in setting up a data set and argue that the so-called ‘hit-rate’ criterion, which is intended to measure the accuracy of a data set by some distance function between the ‘true’ but unobserved values and the imputed values, is neither required nor desirable. In contrast, the so-called ‘inference’ criterion allows for valid inferences based on a suitably completed data set under rather general conditions. The underlying theoretical concepts are illustrated by means of a simulation study. It is emphasised that the same principal arguments apply to other survey processes that introduce uncertainty into an edited data set.
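
The contrast between the two criteria can be made concrete with a small toy simulation. The sketch below is not the simulation study of this paper. It is a minimal illustration under assumed conditions: a linear model y = x + e with standard normal x and e, values of y missing completely at random, and single imputation only. The function name `simulate` and all parameter values are hypothetical. Conditional-mean imputation should come closer to the ‘true’ values (a better ‘hit rate’, measured here as root mean squared error), while stochastic regression imputation should reproduce the variance of y more faithfully, which is what matters for inference.

```python
import random
import statistics


def simulate(n=20000, miss_prob=0.5, seed=1):
    """Compare two imputation methods on a toy data set (assumed model)."""
    rng = random.Random(seed)

    # Assumed data-generating process: y = x + e, x and e standard normal.
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [xi + rng.gauss(0, 1) for xi in x]

    # Values of y are missing completely at random with probability miss_prob.
    missing = [rng.random() < miss_prob for _ in range(n)]

    # Ordinary least squares fit of y on x using the observed cases only.
    xo = [xi for xi, m in zip(x, missing) if not m]
    yo = [yi for yi, m in zip(y, missing) if not m]
    mx, my = statistics.fmean(xo), statistics.fmean(yo)
    sxx = sum((a - mx) ** 2 for a in xo)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xo, yo))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(xo, yo))
    resid_sd = (rss / (len(xo) - 2)) ** 0.5

    results = {"true_variance": statistics.variance(y)}
    # "mean": impute the fitted conditional mean.
    # "stochastic": add a random residual draw to the fitted mean.
    for name, noise_sd in (("mean", 0.0), ("stochastic", resid_sd)):
        completed, sq_err = [], []
        for xi, yi, m in zip(x, y, missing):
            if m:
                noise = rng.gauss(0, noise_sd) if noise_sd > 0 else 0.0
                imputed = intercept + slope * xi + noise
                completed.append(imputed)
                sq_err.append((imputed - yi) ** 2)
            else:
                completed.append(yi)
        results[name] = {
            # 'Hit-rate' style distance between true and imputed values.
            "rmse": statistics.fmean(sq_err) ** 0.5,
            # Variance estimate from the completed data set.
            "variance": statistics.variance(completed),
        }
    return results


if __name__ == "__main__":
    for key, value in simulate().items():
        print(key, value)
```

In this sketch the method that wins on the distance measure loses on the variance estimate, which illustrates why closeness of imputed to ‘true’ values is not a suitable yardstick for the quality of a completed data set. A fuller treatment would also have to reflect the uncertainty of the imputation model itself, for example through multiple imputation, which the sketch deliberately leaves out.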