EFFECT OF THE TRAINING SET CONFIGURATION ON SENTINEL-2-BASED URBAN LOCAL CLIMATE ZONE CLASSIFICATION

Abstract. Like any supervised classification procedure, Local Climate Zone (LCZ) mapping requires reliable reference data. Such data are usually created manually and inevitably contain label noise, caused by the complexity of the LCZ class scheme as well as by variations in cultural and physical environmental factors. This study evaluates the impact of the training set configuration, i.e. the number of training samples and the class imbalance, on the performance of Canonical Correlation Forests (CCFs) in classifying the 11 urban LCZ classes. Experiments are carried out on globally available Sentinel-2 imagery. Besides the multi-spectral observations, several index measures extracted from the images as well as the Global Urban Footprint (GUF) and OpenStreetMap (OSM) layers are fed into the CCF classifier. The results show that different LCZs favor different configurations in terms of training sample number and balance. Based on these findings, majority voting over the predictions obtained from different configurations is proposed and performed, which yields a significant accuracy improvement.
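
The fusion step mentioned above can be illustrated with a minimal sketch of per-sample majority voting. It assumes that each training set configuration has already produced one integer label map, flattened to a vector. The function name `majority_vote`, the toy labels, and the tie-breaking rule (lowest class index wins) are illustrative assumptions, not details taken from the study.

```python
import numpy as np


def majority_vote(predictions, n_classes):
    """Fuse label predictions from several classifier configurations.

    predictions: array-like of shape (n_configs, n_samples) holding
                 integer class labels in the range [0, n_classes).
    Returns an array of shape (n_samples,) with the most frequent label
    per sample. Ties go to the lowest class index (an assumption made
    for this sketch only).
    """
    predictions = np.asarray(predictions)
    # Count the votes each class receives for every sample;
    # the stacked result has shape (n_classes, n_samples).
    counts = np.stack(
        [(predictions == c).sum(axis=0) for c in range(n_classes)]
    )
    # argmax returns the first maximum, i.e. the lowest class index on ties.
    return counts.argmax(axis=0)


# Toy example: three hypothetical configurations, four samples, three classes.
preds = np.array([
    [0, 1, 2, 2],
    [0, 1, 1, 2],
    [1, 1, 2, 0],
])
fused = majority_vote(preds, n_classes=3)
print(fused)
```

With an odd number of configurations and only two competing labels per sample, ties cannot occur; with more classes a different tie-breaking rule (for example, preferring the configuration with the best validation accuracy) may be more appropriate.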
