Distance to second cluster as a measure of classification confidence

Most image classification algorithms compute the distance between the spectral signature of a given pixel and a set of candidate clusters, each representing a discrete land cover category, within an n-dimensional feature space. Every pixel is ultimately closest to one of the predefined clusters. Classification algorithms differ in how they decide which cluster is closest or most likely, but in general each pixel receives the label of its closest cluster. Pixels that are virtually equidistant from two or more clusters expose a limitation of this approach. Such limitations arise when distances are long and the pixel does not clearly belong to any single category, when the pixel represents mixed land cover, or when it can easily be confused spectrally between two or more categories. We propose that retaining the distance to the second closest cluster can shed light on the confidence with which a label is assigned, and we present several examples of how this additional information might enhance accuracy assessments and improve classification confidence. The method was developed with simplicity as a goal: it assumes that the classification has already been performed and that standard clustering reports are available. We illustrate the technique over a test site in central British Columbia, Canada, using classified image data from a nation-wide land cover mapping project. Calculating multi-spectral Euclidean distances to cluster centroids, standardized by cluster variance, allows all potential class assignments to be compared within a unified framework. The resulting distances provide a measure of relative confidence in the classification at the level of individual pixels.
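The distance calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the example centroids and variances, and in particular the `confidence_ratio` score (one minus the ratio of closest to second closest distance) are hypothetical choices made here for illustration. The abstract states only that variance-standardized Euclidean distances are computed and that the distance to the second closest cluster is retained.

```python
import math


def standardized_distances(pixel, centroids, variances):
    """Euclidean distance from one pixel to each cluster centroid,
    with each band's squared difference divided by that cluster's
    per-band variance (assumed form of the standardization)."""
    distances = []
    for mean, var in zip(centroids, variances):
        total = sum((x - m) ** 2 / v for x, m, v in zip(pixel, mean, var))
        distances.append(math.sqrt(total))
    return distances


def two_closest(distances):
    """Return (closest index, its distance, second closest index, its distance)."""
    order = sorted(range(len(distances)), key=distances.__getitem__)
    first, second = order[0], order[1]
    return first, distances[first], second, distances[second]


def confidence_ratio(d1, d2):
    """Hypothetical score: 0 when the two closest clusters are equidistant,
    approaching 1 when the pixel lies on the closest centroid."""
    if d2 == 0:
        return 0.0
    return 1.0 - d1 / d2


# Illustrative two-band cluster statistics (made-up values, not from the study).
centroids = [[10.0, 20.0], [30.0, 40.0], [50.0, 20.0]]
variances = [[4.0, 4.0], [4.0, 4.0], [16.0, 16.0]]

clear_pixel = [12.0, 22.0]      # near the first centroid
ambiguous_pixel = [20.0, 30.0]  # midway between the first two centroids

for pixel in (clear_pixel, ambiguous_pixel):
    label, d1, runner_up, d2 = two_closest(
        standardized_distances(pixel, centroids, variances)
    )
    print(pixel, label, runner_up, round(confidence_ratio(d1, d2), 3))
```

In this sketch both pixels receive the label of the first cluster, but the ambiguous pixel scores zero because its second closest cluster is equally near. That difference is the kind of per-pixel information the retained second distance is meant to expose.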
