A New Accuracy Assessment Method for One-Class Remote Sensing Classification

In one-class remote sensing classification, users are interested in classifying only one specific land type (the positive class), without considering other classes (the negative class). Previous researchers have proposed various one-class classification methods that do not require negative data. An appropriate accuracy measure is usually needed to tune free parameters and thresholds and to evaluate the classification result. However, traditional accuracy measures, such as the kappa coefficient and the F-measure (F), require both positive and negative data, and hence they are not applicable to positive-only data. In this paper, we investigate a new accuracy assessment method that does not require negative data. Two new statistics, Fpb (a proxy of the F-measure based on positive-background data) and Fcpb (a prevalence-calibrated proxy of the F-measure based on positive-background data), can be calculated from a modified confusion matrix in which the observed negative data are replaced by background data. To investigate the effectiveness of the new method, we produced different one-class classification results using two aerial photograph scenes, and the accuracy values were evaluated by Fpb, Fcpb, the kappa coefficient, and F. The effectiveness of Fpb in model and threshold selection was investigated as well. Experimental results show that Fpb, Fcpb, F, and the kappa coefficient behave similarly and rank the models by accuracy in a similar order. In model and threshold selection, the classification accuracy values produced by maximizing Fpb and by maximizing F are similar, and both are higher than those produced by setting an arbitrary rejection fraction. Therefore, we conclude that the new method is effective in model selection, threshold selection, and accuracy assessment, and that it will have important applications in one-class remote sensing classification because negative data are not needed.
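The idea of a modified confusion matrix and of threshold selection by maximizing a positive-background proxy can be illustrated with a minimal sketch. The abstract does not state the formulas for Fpb or Fcpb, so the code below only applies the ordinary F-measure formula to a confusion matrix whose negative row is filled with background samples. That choice is an assumption and may differ from the paper's exact definition. The function names (`f_from_counts`, `f_proxy_pb`, `select_threshold`) and the score values are hypothetical, and Fcpb is omitted because its prevalence calibration is not described here.

```python
from typing import Sequence


def f_from_counts(tp: int, fp: int, fn: int) -> float:
    """Ordinary F-measure written in terms of confusion-matrix counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f_proxy_pb(pos_scores: Sequence[float],
               bg_scores: Sequence[float],
               threshold: float) -> float:
    """Assumed proxy: F formula on a positive-background confusion matrix.

    Background samples stand in for the unavailable negative samples, so
    "false positives" here are background samples predicted as positive.
    """
    tp = sum(s >= threshold for s in pos_scores)      # positives accepted
    fn = len(pos_scores) - tp                         # positives rejected
    fp_bg = sum(s >= threshold for s in bg_scores)    # background accepted
    return f_from_counts(tp, fp_bg, fn)


def select_threshold(pos_scores: Sequence[float],
                     bg_scores: Sequence[float],
                     candidates: Sequence[float]) -> float:
    """Pick the candidate threshold that maximizes the proxy statistic."""
    return max(candidates, key=lambda t: f_proxy_pb(pos_scores, bg_scores, t))


# Hypothetical classifier scores for labeled positive and background pixels.
pos = [0.9, 0.8, 0.7, 0.4]
bg = [0.85, 0.3, 0.2, 0.1, 0.6, 0.05]
best = select_threshold(pos, bg, [0.3, 0.5, 0.75])
```

Because background data contain an unknown share of true positives, the value of such a proxy is not itself an accuracy estimate. In this sketch it is used only to rank thresholds or models, which is the use the abstract reports for Fpb.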
