Effect of Label Noise on the Machine-Learned Classification of Earthquake Damage

Automated classification of earthquake damage in remotely sensed imagery using machine learning depends on training data: examples that a human expert has correctly labeled as containing damage or not. Mislabeled training data are a major source of classifier error. They arise from imprecise digital labeling tools and from crowdsourced volunteers who are not adequately trained on, or invested in, the task. Because remote sensing classification is spatial in nature, classes that occur in close proximity to rubble, a major byproduct of earthquake damage in urban areas, are consistently mislabeled. In this study, we examine how mislabeled training data, or label noise, affect the quality of rubble classifiers operating on high-resolution remotely sensed images. We first study how label noise that depends on geospatial proximity, or geospatial label noise, compares to standard random noise. Our study shows that classifiers that are robust to random noise are more susceptible to geospatial label noise. We then compare the effects of label noise on the pixel-based and object-based remote sensing classification paradigms. While object-based classifiers are known to outperform their pixel-based counterparts, this study demonstrates that they are more susceptible to geospatial label noise. We also introduce a new labeling tool designed to improve labeling precision and image coverage. This work has important implications for the Sendai Framework, because automated damage classification supports rapid disaster assessment and contributes to minimizing disaster risk.
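The distinction between random and geospatial label noise can be illustrated with a small sketch. The abstract does not specify the noise model used in the study, so everything here is an assumption made for illustration: binary pixel labels (1 = rubble, 0 = not rubble), a square neighborhood of `radius` pixels as the notion of proximity, and the hypothetical function names `random_label_noise` and `geospatial_label_noise`.

```python
import numpy as np

def random_label_noise(labels, rate, rng):
    """Flip each binary label independently with probability `rate`."""
    flip = rng.random(labels.shape) < rate
    return np.where(flip, 1 - labels, labels)

def geospatial_label_noise(labels, radius, rate, rng):
    """Mislabel non-rubble pixels as rubble, but only near true rubble.

    Assumption: "near" means within `radius` pixels along both axes
    (a square window), which is an illustrative choice, not the paper's.
    """
    h, w = labels.shape
    padded = np.pad(labels, radius, mode="constant")
    near = np.zeros((h, w), dtype=bool)
    # OR together every shifted copy of the mask inside the window.
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            near |= padded[dy:dy + h, dx:dx + w] == 1
    candidates = near & (labels == 0)
    flip = candidates & (rng.random(labels.shape) < rate)
    return np.where(flip, 1, labels)

# Toy 20x20 label mask with one 4x4 rubble patch.
rng = np.random.default_rng(0)
labels = np.zeros((20, 20), dtype=int)
labels[8:12, 8:12] = 1

noisy_rand = random_label_noise(labels, rate=0.1, rng=rng)
noisy_geo = geospatial_label_noise(labels, radius=2, rate=0.5, rng=rng)
```

In this sketch, random noise can corrupt any pixel in the image, while geospatial noise is confined to a band around the rubble patch, which mirrors the consistent mislabeling of neighboring classes described above.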
