Semantic Segmentation of Aerial Images With Shuffling Convolutional Neural Networks

Semantic segmentation of aerial images refers to assigning one land cover category to each pixel. This is a challenging task due to the great differences in the appearances of ground objects. Many attempts have been made during the past decades. In recent years, convolutional neural networks (CNNs) have been introduced in the remote sensing field, and various solutions have been proposed to realize dense semantic labeling with CNNs. In this letter, we propose shuffling CNNs to realize semantic segmentation of aerial images in a periodic shuffling manner. This approach is a supplement to current methods for semantic segmentation of aerial images. We propose a naive version and a deeper version of this method, and both are adept at detecting small objects. Additionally, we propose a method called field-of-view (FoV) enhancement that can enhance the predictions. This method can be applied to various networks, and our experiments verify its effectiveness. The final results are further improved through an ensemble method that averages the score maps generated by the models at different checkpoints of the same network. We evaluate our models using the ISPRS Vaihingen and Potsdam data sets, and we acquire promising results using these two data sets.

[1]  Junwei Han,et al.  Learning Rotation-Invariant Convolutional Neural Networks for Object Detection in VHR Optical Remote Sensing Images , 2016, IEEE Transactions on Geoscience and Remote Sensing.

[2]  Iasonas Kokkinos,et al.  Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs , 2014, ICLR.

[3]  Xiaoqiang Lu,et al.  Remote Sensing Image Scene Classification: Benchmark and State of the Art , 2017, Proceedings of the IEEE.

[4]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[5]  Jamie Sherrah,et al.  Fully Convolutional Networks for Dense Semantic Labelling of High-Resolution Aerial Imagery , 2016, ArXiv.

[6]  Michele Volpi,et al.  Dense Semantic Labeling of Subdecimeter Resolution Images With Convolutional Neural Networks , 2016, IEEE Transactions on Geoscience and Remote Sensing.

[7]  Shiming Xiang,et al.  Vehicle Detection in Satellite Images by Hybrid Deep Convolutional Neural Networks , 2014, IEEE Geoscience and Remote Sensing Letters.

[8]  Daniel Rueckert,et al.  Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Xueming Qian,et al.  Semantic Annotation of High-Resolution Satellite Images via Weakly Supervised Learning , 2016, IEEE Transactions on Geoscience and Remote Sensing.

[10]  Zheng Zhang,et al.  MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems , 2015, ArXiv.

[11]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Uwe Stilla,et al.  SEMANTIC SEGMENTATION OF AERIAL IMAGES WITH AN ENSEMBLE OF CNNS , 2016 .

[14]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[15]  Chunhong Pan,et al.  Building extraction from multi-source remote sensing images via deep deconvolution neural networks , 2016, 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS).

[16]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[17]  Markus Gerke,et al.  Use of the stair vision library within the ISPRS 2D semantic labeling benchmark (Vaihingen) , 2014 .

[18]  Seunghoon Hong,et al.  Learning Deconvolution Network for Semantic Segmentation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).