Analysis of Encoder-Decoder Based Deep Learning Architectures for Semantic Segmentation in Remote Sensing Images

Semantic segmentation of remote sensing images is a challenging task. Every pixel in a remote sensing image carries a semantic meaning, and automatically annotating each pixel remains an open challenge for the research community because of the high spatial resolution of the imagery. To address this issue, deep learning based encoder-decoder architectures such as SegNet and ResNet, which are widely used on computer vision datasets, are adapted to remote sensing images, and their performance is analyzed in terms of pixel-wise classification accuracy. The experiments indicate that SegNet suffers from the degradation problem as the depth of the network increases, reaching an overall accuracy of about 86.086%, whereas the residual network manages to overcome the degradation effect and reaches an overall accuracy of about 87.747%.
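As an illustration of the evaluation criterion, overall pixel-wise accuracy can be taken as the fraction of pixels whose predicted class matches the reference label. The sketch below is a minimal pure-Python version under that assumption; the function name `overall_accuracy`, the optional `ignore_label` parameter, and the toy label maps are hypothetical and are not taken from the paper's experimental setup.

```python
def overall_accuracy(predicted, reference, ignore_label=None):
    """Return the fraction of pixels whose predicted class equals the reference class.

    `predicted` and `reference` are 2D lists of integer class labels with the
    same shape. Pixels whose reference label equals `ignore_label` are skipped
    (an assumption of this sketch, e.g. for unlabeled boundary pixels).
    """
    if len(predicted) != len(reference):
        raise ValueError("label maps must have the same number of rows")

    correct = 0
    total = 0
    for pred_row, ref_row in zip(predicted, reference):
        if len(pred_row) != len(ref_row):
            raise ValueError("label maps must have the same number of columns")
        for pred_label, ref_label in zip(pred_row, ref_row):
            if ignore_label is not None and ref_label == ignore_label:
                continue  # excluded from both numerator and denominator
            total += 1
            if pred_label == ref_label:
                correct += 1

    if total == 0:
        raise ValueError("no pixels left to evaluate")
    return correct / total


# Toy 2 x 3 label maps with three classes (0, 1, 2); values are made up.
reference = [[0, 0, 1],
             [2, 1, 1]]
predicted = [[0, 1, 1],
             [2, 1, 0]]

print(f"Overall accuracy: {100 * overall_accuracy(predicted, reference):.3f}%")
```

On these toy maps four of the six pixels agree, so the reported value is the ratio 4/6. Overall accuracy weights every pixel equally, so on imbalanced scenes it is dominated by the most frequent classes.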
