Gated Convolutional Neural Network for Semantic Segmentation in High-Resolution Images

Semantic segmentation is a fundamental task in remote sensing image processing. The large appearance variations of ground objects make it challenging. Recently, deep convolutional neural networks (DCNNs) have shown outstanding performance on this task. A common strategy these methods (e.g., SegNet) use to improve performance is to combine the feature maps learned at different DCNN layers. However, the combination is usually implemented by summing or concatenating feature maps, which treats all features indiscriminately. In fact, features at different positions contribute differently to the final performance, so it is advantageous to select features adaptively and automatically when merging feature maps from different layers. To this end, we propose a gated convolutional neural network. Specifically, we explore the relationship between the information entropy of the feature maps and the label-error map, and then embed a gate mechanism to integrate the feature maps more effectively. The gate is implemented by entropy maps, which assign adaptive weights to the feature maps according to their relative importance. The entropy maps, i.e., the gates, guide the network to focus on highly uncertain pixels, where detailed information from lower layers is required to improve separability. The selected features are finally combined and fed into the classifier layer, which predicts the semantic label of each pixel. The proposed method achieves competitive segmentation accuracy on the public ISPRS 2D Semantic Labeling benchmark, which is challenging to segment using only the RGB images.
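The entropy-based gating idea can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the gate is taken to be the per-pixel Shannon entropy of softmax class probabilities, normalized by log C so that it lies in [0, 1], and the gated lower-layer features are merged by summation. The function names `entropy_gate` and `gated_merge` are hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(scores, axis=0):
    """Numerically stable softmax along the given axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def entropy_gate(class_scores, eps=1e-12):
    """Per-pixel gate from prediction uncertainty (illustrative assumption).

    class_scores: array of shape (C, H, W) holding raw class scores.
    Returns an (H, W) map in [0, 1]: near 1 where the prediction is
    highly uncertain, near 0 where it is confident.
    """
    num_classes = class_scores.shape[0]
    probs = softmax(class_scores, axis=0)
    entropy = -(probs * np.log(probs + eps)).sum(axis=0)
    # Normalize by the maximum possible entropy, log(C), and clip rounding noise.
    return np.clip(entropy / np.log(num_classes), 0.0, 1.0)


def gated_merge(high_feat, low_feat, class_scores):
    """Add lower-layer detail only where the gate marks pixels as uncertain.

    high_feat, low_feat: arrays of shape (K, H, W) at the same resolution.
    Merging by summation is an assumption; the paper's merge may differ.
    """
    gate = entropy_gate(class_scores)
    # gate has shape (H, W); broadcast it over the K feature channels.
    return high_feat + gate[None, :, :] * low_feat


if __name__ == "__main__":
    scores = np.zeros((6, 2, 2))   # six classes, uniform scores everywhere
    scores[0, 0, 0] = 100.0        # one pixel with a confident prediction
    high = np.ones((3, 2, 2))
    low = np.ones((3, 2, 2))
    merged = gated_merge(high, low, scores)
    print(entropy_gate(scores))
    print(merged[:, 0, 0], merged[:, 1, 1])
```

In this sketch the confident pixel keeps essentially only its higher-layer features, while pixels with uniform class scores receive the full lower-layer contribution, in contrast to plain summation or concatenation, which would weight every position equally.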
