Building segmentation of remote sensing images using deep neural networks and domain transform CRF

Automatic building segmentation is a critical task in the semantic segmentation of remote sensing images. The success of deep neural networks has led to advances in using fully convolutional networks (FCNs) to extract buildings from high-resolution images. However, the downsampling in these networks inevitably loses detail in the segmentation results. To address this problem, some methods refine the FCN output with probabilistic graphical models such as the fully connected conditional random field (CRF). Nevertheless, many fully connected CRF-based methods are too time-consuming to be suitable for building segmentation in some situations. In this paper, we propose DT-CRF, a novel time-efficient end-to-end CRF model based on the domain transform algorithm. To accelerate message passing in mean-field approximate inference, the proposed model takes edge maps as the joint (guidance) image for DT-CRF and computes the pairwise potential with the domain transform algorithm instead of a Gaussian kernel function. We also design a multi-task network that generates masks and edges simultaneously, so that DT-CRF can readily refine the segmentation results using information from the model itself. Evaluation on remote sensing image datasets verifies the time and space efficiency of the proposed DT-CRF and demonstrates a distinct improvement in the segmentation results.
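To make the edge-guided filtering idea concrete, the following is a minimal one-dimensional sketch of a recursive domain transform filter driven by an edge map. It is not the implementation used in this paper: the function names, the parameters `sigma_s` and `sigma_r`, and the weight formula (taken from the commonly used recursive-filter form of the domain transform) are assumptions chosen for illustration. In a CRF setting, a filter of this kind would be applied to the per-class probability maps along rows and columns in place of Gaussian message passing.

```python
import numpy as np


def dt_weights(edge, sigma_s, sigma_r):
    """Per-sample recursion weights from an edge map (assumed formulation).

    A strong edge gives a large domain-transform distance and therefore a
    weight close to 0, which blocks smoothing across that sample.
    """
    edge = np.asarray(edge, dtype=float)
    dist = 1.0 + (sigma_s / sigma_r) * edge
    return np.exp(-np.sqrt(2.0) * dist / sigma_s)


def dt_filter_1d(x, edge, sigma_s=8.0, sigma_r=0.05):
    """Edge-aware smoothing of a 1-D signal: one forward and one backward pass."""
    w = dt_weights(edge, sigma_s, sigma_r)
    y = np.asarray(x, dtype=float).copy()
    # Forward pass: each sample mixes with its already-filtered left neighbour.
    for i in range(1, len(y)):
        y[i] = (1.0 - w[i]) * y[i] + w[i] * y[i - 1]
    # Backward pass: each sample mixes with its already-filtered right neighbour.
    for i in range(len(y) - 2, -1, -1):
        y[i] = (1.0 - w[i + 1]) * y[i] + w[i + 1] * y[i + 1]
    return y


if __name__ == "__main__":
    # A step signal standing in for a building-probability profile.
    step = np.array([0.0] * 8 + [1.0] * 8)
    no_edge = np.zeros(16)
    edge_at_step = np.zeros(16)
    edge_at_step[8] = 1.0  # edge detected exactly at the step
    print(dt_filter_1d(step, no_edge))       # the step is blurred
    print(dt_filter_1d(step, edge_at_step))  # the step is preserved
```

Each pass costs time linear in the number of samples, which is the property that makes this family of filters attractive as a replacement for dense Gaussian filtering. The sketch uses a fixed two-pass schedule and fixed parameters for simplicity.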
