Paper Information - Context Aggregation Network for Semantic Labeling in Aerial Images

Multi-scale object recognition and accurate object localization are two major challenges for semantic segmentation in high-resolution aerial images. To address them, we design a Context Fuse Module that aggregates multi-scale features and propose an Attention Mix Module that combines features from different levels for higher localization accuracy. We further employ a Residual Convolutional Module to refine features at all levels. Based on these modules, we construct a new end-to-end network for semantic labeling in aerial images. Experiments demonstrate that our network outperforms other state-of-the-art models on the large-scale ISPRS Vaihingen 2D Semantic Labeling Challenge dataset. The model implementation code is publicly available.

Wensheng Cheng | Yu Cheng | Wen Yang | Youqi Pan | Haowen Guo

[1] Xiaogang Wang, et al. Pyramid Scene Parsing Network, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2] Jocelyn Chanussot, et al. Learning to semantically segment high-resolution remote sensing images, 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).

[3] Bastian Leibe, et al. Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4] Xiaogang Wang, et al. Context Encoding for Semantic Segmentation, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[5] Xiangyu Zhang, et al. Large Kernel Matters — Improve Semantic Segmentation by Global Convolutional Network, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6] Jamie Sherrah, et al. Semantic Labeling of Aerial and Satellite Imagery, 2016, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.

[7] Roberto Cipolla, et al. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8] Vladlen Koltun, et al. Multi-Scale Context Aggregation by Dilated Convolutions, 2015, ICLR.

[9] Gang Yu, et al. Learning a Discriminative Feature Network for Semantic Segmentation, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[10] Ian D. Reid, et al. RefineNet: Multi-path Refinement Networks for High-Resolution Semantic Segmentation, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.

[12] Bohyung Han, et al. Learning Deconvolution Network for Semantic Segmentation, 2015.

[13] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.

[14] George Papandreou, et al. Rethinking Atrous Convolution for Semantic Image Segmentation, 2017, ArXiv.

[15] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.

[16] Thomas Brox, et al. U-Net: Convolutional Networks for Biomedical Image Segmentation, 2015, MICCAI.

[17] Garrison W. Cottrell, et al. Understanding Convolution for Semantic Segmentation, 2017, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).

[18] Lingfeng Wang, et al. Context-aware cascade network for semantic labeling in VHR image, 2017, 2017 IEEE International Conference on Image Processing (ICIP).

[19] Iasonas Kokkinos, et al. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs, 2014, ICLR.

[20] Wei Liu, et al. ParseNet: Looking Wider to See Better, 2015, ArXiv.

[21] Gang Sun, et al. Squeeze-and-Excitation Networks, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[22] George Papandreou, et al. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation, 2018, ECCV.