Sampling designs to test land-use map accuracy

In testing the accuracy of qualitative characteristics determined from remotely sensed data, five problems arise: I. What proportion of all decisions is correct? II. What proportion of the allocation to a category is correct? III. What proportion of the true category is correctly allocated? IV. Is a category overestimated or underestimated? V. Are the errors randomly distributed? To tackle these questions it is necessary to determine sample size (always >50) and to adopt a stratified sampling design. The questions can then be answered using tabulated values for the binomial errors (Questions I-IV) and Poisson frequencies (Question V).

Similarly, there are circumstances in which easily collected diagnostic variables are used to predict other less easily observed characteristics. For example, field and air photo observations of aspect, slope, lithology, and vegetation might be used to "predict" soil type. Once again, the prediction method may have been based upon field data, but its reliability as a method can only be ascertained by a post facto test using independently sampled field observations. In

V. If error occurs in either of the ways II and III, is there any bias in these errors towards specific categories? This problem may arise in a multicategory case where some categories are acknowledged to be very similar. In such a case, misclassification between similar categories may be high although overall accuracy is quite high. This effect appears in Table 1, where many of the errors arise from an apparent confusion between E and F.

PHOTOGRAMMETRIC ENGINEERING AND REMOTE SENSING, Vol. 45, No. 4, April 1979, pp. 529-533.

TABLE 1.
A FIELD TEST DATA TABLE
Predicted characteristic: A B C D E F
Characteristics identified in field testing (each row ends with its row total):
A: 194, 6, 3, 2; total 205
B: 3, 80, 1, 1; total 85
C: 2, 7, 180; total 189
D: 5, 5, 197; total 207
E: 1, 1, 180, 15; total 197
F: 1, 1, 15, 65; total 82

All these questions can be answered with complete confidence if the study is a total enumeration or a very large sample. But total enumeration or very large samples involve the very heavy burden of field observation which the prediction technique is presumably designed to avoid. The method must therefore focus upon the extent to which Questions I-V can be answered by recourse to sample data sets.

It should be noted that in some cases there may be sources of error other than the "prediction" system. For example, in certain types of remotely sensed imagery the matching of sites on the imagery with exact locations in the field is itself subject to error, which may then lead to identification of an apparently incorrect prediction. Similarly, a time interval between prediction and field survey may result in changes which are recorded as errors of prediction. No attempt is made to estimate such errors in the procedures described below.

The question of sample size can be introduced with a simple example. Suppose that only ten sample points are checked and that the "results" indicate that all ten determinations were correct. The immediate reaction, which is quite common in some circles (Lins, 1976), is to assume that the method is 100 percent correct. However, sampling theory tells us that where there are ten trials the probability of all ten being correct is the 10th power of the true proportion of correct determinations; for example, a method that is truly 90 percent correct would still yield ten correct out of ten about 35 percent of the time. These probabilities are given in Table 2. On the other hand, a result of 9/10, suggesting 90 percent correct, might arise from a situation where the true proportion was much higher (99 percent) or much lower (85 percent). These results are derived by using the terms of the binomial expansion.
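The binomial terms referred to above can be computed directly. The following is a minimal sketch, not the author's own procedure: the function names `binomial_pmf` and `small_sample_row` are hypothetical, and the list of true proportions is an illustrative assumption (0.99, 0.90, and 0.85 are mentioned in the text; the others are not taken from Table 2).

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k correct determinations in n trials
    when the true proportion correct is p (one term of the binomial expansion)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def small_sample_row(p, n=10):
    """Return (P(n/n), P((n-1)/n), P((n-1)/n or better)) for true proportion p."""
    p_all = binomial_pmf(n, n, p)           # equals p**n
    p_one_miss = binomial_pmf(n - 1, n, p)  # equals n * p**(n-1) * (1 - p)
    return p_all, p_one_miss, p_all + p_one_miss


if __name__ == "__main__":
    # Illustrative true proportions (an assumption, not the values of Table 2).
    for p in (0.99, 0.95, 0.90, 0.85, 0.80):
        p_all, p_one_miss, p_either = small_sample_row(p)
        print(f"true p = {p:.2f}  P(10/10) = {p_all:.3f}  "
              f"P(9/10) = {p_one_miss:.3f}  P(9/10 or better) = {p_either:.3f}")
```

Run as a script, this prints one row per assumed true proportion. The three returned values correspond to the probability of 10/10, the probability of 9/10, and their sum, which is the chance of observing 9/10 or better in a sample of ten.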
In order to establish necessary sample sizes, the required confidence limits must first be fixed. In this discussion it is assumed that the 95 percent level will be acceptable, but all the guidelines given would need to be recalculated if different confidence limits were set. By using this approach, it is possible to establish the range, at 95 percent confidence limits, within which the true proportion of errors probably lies for any specified sample size and success rate. These ranges are tabulated for specific sample sizes by Hord and Brooner (1976) or can be presented in graphical form as in Figure 1 (see also Arkin and Colton, 1973, or Hill et al., 1961). In that figure the actual percent accuracy achieved in the sample can be related to lower and upper bounds for the range of the probable true accuracy. It is worthwhile to stress that the true value may be higher or lower than the sample value; for example, the sample value of 45/50 (90 percent) might at the 95 percent confidence limits imply a true population

TABLE 2. ALTERNATIVE INTERPRETATIONS OF RESULTS FROM A SMALL SAMPLE (n = 10)
Columns: (a) If true proportion correct; (b) Probability of 10/10; (c) Probability of 9/10; (d) Probability of 9/10 or better (b + c)