Optimal Data Verification Tests

The problem of data verification may be described as follows: two parties have concluded a contract stipulating that one party, the inspectee, is to report a set of data to the other, the inspector. The inspector has to decide, on the basis of his own measurements, whether to accept the data reported by the inspectee as correct or to assume that they have been falsified. This situation is modeled as a statistical game, and practical solutions are supplied. We consider the verification of n data with a sample of size k, and prove that the traditional D-test is optimal for both the maximum and the minimum sample size, k = n and k = 1, respectively. These outcomes are, of course, to be expected; however, optimal falsification strategies are also obtained in these cases. In the special case in which two out of three data sets are verified, strong numerical evidence indicates that the D-test is no longer optimal if the total falsification is high. By applying the results obtained for k = 1 to an arbitrary sample size k, however, we show that the D-test is optimal in the case of low total falsification.
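
The D-test itself is not defined in this abstract. As a minimal illustration, the sketch below assumes the commonly used difference statistic: the sum of reported-minus-measured differences over the k sampled items, extrapolated to all n items by the factor n/k, and compared with a one-sided threshold. The function names, the common measurement error standard deviation sigma, the normal approximation, and the significance level alpha are all assumptions made for this example and are not taken from the paper.

```python
from statistics import NormalDist


def d_statistic(reported, measured):
    """Extrapolated difference statistic D (assumed form).

    reported: list of the n values reported by the inspectee.
    measured: dict mapping a sampled index to the inspector's own measurement.
    """
    n = len(reported)
    k = len(measured)
    # Sum the reported-minus-measured differences over the sample,
    # then scale by n/k to extrapolate to the whole population.
    return (n / k) * sum(reported[i] - measured[i] for i in measured)


def d_test(reported, measured, sigma, alpha=0.05):
    """Return True if the inspector raises an alarm (data assumed falsified).

    Assumes independent N(0, sigma^2) measurement errors and that
    falsification overstates the reported values, hence a one-sided test.
    """
    n = len(reported)
    k = len(measured)
    # Under the no-falsification hypothesis, Var(D) = n^2 * sigma^2 / k.
    std_d = n * sigma / k ** 0.5
    threshold = NormalDist().inv_cdf(1 - alpha) * std_d
    return d_statistic(reported, measured) > threshold


# Hypothetical data: n = 5 reported values, k = 2 of them re-measured.
reported = [10.0, 10.0, 10.0, 10.0, 10.0]
measured = {1: 9.0, 3: 8.0}
alarm = d_test(reported, measured, sigma=1.0)
```

Under these assumptions the inspector's only design choices are the sample and the threshold, which is why a single scalar statistic such as D is convenient to analyze as a strategy in a game against the inspectee.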