Toward estimation of map accuracy without a probability test sample

The time and effort required for probability sampling in accuracy assessment of large-scale land cover maps often mean that probability test samples are not collected. Yet map usefulness is substantially reduced without reliable accuracy estimates. In this article, we introduce a method of estimating the accuracy of a classified map that does not use a test sample in the usual sense, but instead estimates the probability of correct classification for each map unit using only the classification rule and the map unit covariates. We argue that the method is an improvement over conventional estimators, though it does not eliminate the need for probability sampling. The method also provides a simple new way of constructing accuracy maps. We illustrate our method, and some of the problems associated with accuracy assessment of broad-scale land cover maps, with a set of nine Landsat Thematic Mapper satellite image-based land cover maps from Montana and Wyoming, USA.
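
The abstract does not spell out the estimator, so the following is only a minimal sketch of the general idea of scoring each map unit from the classification rule and its covariates. It assumes a k-nearest-neighbor rule and takes the share of neighbors voting for the assigned class as a rough estimate of the probability of correct classification; the function name `knn_correct_prob`, the toy covariates, and the choice `k=3` are all hypothetical and are not taken from the article.

```python
import numpy as np

def knn_correct_prob(train_x, train_y, map_x, k=3):
    """Return (assigned_label, estimated P(correct)) for each map unit.

    Assumption: the classification rule is plain k-NN, and the fraction of
    the k nearest training observations that agree with the assigned class
    stands in for the probability of correct classification.
    """
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y)
    map_x = np.asarray(map_x, dtype=float)
    classes = np.unique(train_y)

    # Euclidean distance from every map unit to every training observation.
    dist = np.linalg.norm(map_x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1, kind="stable")[:, :k]
    neighbor_labels = train_y[nearest]  # shape (n_map_units, k)

    # Vote share per class, one column per class.
    votes = np.stack(
        [(neighbor_labels == c).mean(axis=1) for c in classes], axis=1
    )
    best = votes.argmax(axis=1)
    return classes[best], votes[np.arange(len(map_x)), best]

# Toy training data: two covariates per unit, two cover types (hypothetical).
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["forest", "forest", "forest", "grass", "grass", "grass"]

# Map units to be scored; no reference labels are needed for these.
map_x = [[0.2, 0.2], [5.5, 5.5], [3.0, 2.5]]

labels, p_correct = knn_correct_prob(train_x, train_y, map_x, k=3)

# Averaging the per-unit values gives one overall accuracy estimate;
# the per-unit values themselves are what an accuracy map would display.
map_accuracy = float(p_correct.mean())

for label, p in zip(labels, p_correct):
    print(label, round(float(p), 3))
print("estimated map accuracy:", round(map_accuracy, 3))
```

Raw vote shares of this kind tend to be optimistic when the training data are not a probability sample, which is consistent with the abstract's caution that the approach does not eliminate the need for probability sampling.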
