Segmentation of Aerial Image with Multi-scale Feature and Attention Model

Aerial image labeling plays an important part in producing high-precision maps. Knowledge of the extent and density of buildings, obtained by segmenting them in aerial images, is necessary for urban planning. Fully convolutional networks (FCNs) have recently shown state-of-the-art performance in image segmentation. To improve the segmentation of aerial images, we combine FCNs with multi-scale features and an attention model so that segmentation is carried out automatically. The attention model adds extra supervision to the features at each scale to achieve better segmentation. Here, U-Net and FCN-8s serve as the base semantic segmentation models and are trained with multi-scale images and attention models. The experiments use different proportions of the Inria Aerial Image Labeling Dataset, which has two semantic classes: building and not building. The results show that the semantic segmentation models combined with multi-scale features and the attention model achieve higher segmentation accuracy and better performance.
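The abstract does not give the fusion formula, so the following is only a minimal sketch of one common way to combine multi-scale predictions with an attention model: each scale produces per-pixel class scores, an attention branch produces one logit per scale and pixel, and a softmax over scales turns those logits into weights for a weighted sum. The function names (`softmax`, `fuse_scales`), the data layout, and the toy numbers are assumptions made for illustration, and the extra supervision applied to each scale during training is not shown.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scales(score_maps, attention_logits):
    """Fuse per-scale class scores with per-pixel attention weights.

    score_maps[s][p][c]: score of class c at pixel p from scale s
        (all scales are assumed to be resized to a common resolution).
    attention_logits[s][p]: unnormalised attention for scale s at pixel p.
    Returns fused[p][c] = sum over s of w[s][p] * score_maps[s][p][c],
    where w[:, p] is a softmax over the scales.
    """
    num_scales = len(score_maps)
    num_pixels = len(score_maps[0])
    num_classes = len(score_maps[0][0])
    fused = []
    for p in range(num_pixels):
        weights = softmax([attention_logits[s][p] for s in range(num_scales)])
        fused.append([
            sum(weights[s] * score_maps[s][p][c] for s in range(num_scales))
            for c in range(num_classes)
        ])
    return fused


# Toy example (hypothetical numbers): 2 scales, 2 pixels,
# 2 classes ordered as (not building, building).
scores = [
    [[2.0, 0.0], [0.0, 1.0]],  # full-resolution branch
    [[0.0, 2.0], [0.0, 3.0]],  # half-resolution branch, upsampled
]
logits = [
    [2.0, 0.0],  # attention favours the full-resolution branch at pixel 0
    [0.0, 0.0],
]
fused = fuse_scales(scores, logits)
labels = [max(range(len(px)), key=lambda c: px[c]) for px in fused]
```

In this toy case the two branches disagree at pixel 0, and the attention weights decide which one dominates the fused score; with equal logits the fusion reduces to a plain average over scales.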
