SDFCNv2: An Improved FCN Framework for Remote Sensing Images Semantic Segmentation

Semantic segmentation is a fundamental task in remote sensing image analysis (RSIA). Fully convolutional networks (FCNs) have achieved state-of-the-art performance in semantic segmentation of natural scene images. However, because remotely sensed (RS) images differ markedly from natural scene images, FCN-based semantic segmentation methods from computer vision do not perform well on RS images without modification. In previous work, we proposed SDFCNv1, an RS image semantic segmentation framework combined with a majority-voting postprocessing method. Nevertheless, SDFCNv1 still has drawbacks, such as a small receptive field and a large number of parameters. In this paper, we propose SDFCNv2, an improved semantic segmentation framework based on SDFCNv1, for semantic segmentation of RS images. We first construct a novel FCN model with hybrid basic convolutional (HBC) blocks and spatial-channel-fusion squeeze-and-excitation (SCFSE) modules, which has a larger receptive field and fewer parameters. We also propose a spectral-specific stochastic-gamma-transform-based (SSSGT-based) data augmentation method, applied during model training, to improve the generalizability of our model. In addition, we design a mask-weighted voting decision fusion postprocessing algorithm for segmenting very large RS images. We conducted comparative experiments on two public datasets and a real surveying and mapping dataset. The experimental results demonstrate that, compared with the SDFCNv1 framework, our SDFCNv2 framework increases the mIoU metric by up to 5.22% while using only about half as many parameters.
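To illustrate the idea behind the spectral-specific stochastic gamma transform, the following is a minimal sketch and not the SDFCNv2 implementation. It assumes pixel values are scaled to [0, 1], that each spectral band receives its own independently sampled gamma, and that gamma is drawn uniformly from a fixed range. The function name `spectral_gamma_augment`, the `gamma_range` default, and the sampling distribution are all hypothetical choices made for this example; the abstract does not specify them.

```python
import random


def spectral_gamma_augment(image, gamma_range=(0.5, 2.0), rng=None):
    """Apply an independently sampled gamma curve to each spectral band.

    image: list of bands; each band is a list of rows of floats in [0, 1].
    Returns (augmented_image, gammas). The input image is not modified.
    """
    rng = rng or random.Random()
    low, high = gamma_range
    augmented, gammas = [], []
    for band in image:
        # Assumption: one gamma per band, sampled uniformly from gamma_range.
        gamma = rng.uniform(low, high)
        gammas.append(gamma)
        # The power-law mapping keeps values in [0, 1] and fixes 0 and 1.
        augmented.append([[pixel ** gamma for pixel in row] for row in band])
    return augmented, gammas


# Usage: a tiny 2-band, 2x2 image with values scaled to [0, 1].
image = [
    [[0.0, 0.25], [0.5, 1.0]],
    [[0.1, 0.2], [0.4, 0.8]],
]
augmented, gammas = spectral_gamma_augment(image, rng=random.Random(0))
```

Because the gamma is sampled per band, the relative brightness of bands changes from one training sample to the next, which is one plausible way to mimic radiometric variation between sensors and acquisition conditions.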
