Automatic Building Extraction from Google Earth Images under Complex Backgrounds Based on Deep Instance Segmentation Network

Building damage accounts for a high percentage of the damage assessed after natural disasters, so extracting buildings from optical remote sensing images is of great significance for disaster reduction and assessment. Traditional methods are mainly semi-automatic: they require human-computer interaction or rely entirely on human interpretation. In this paper, inspired by recent advances in deep learning, we propose an improved Mask Region Convolutional Neural Network (Mask R-CNN) method that simultaneously detects the rotated bounding boxes of buildings and segments them from very complex backgrounds. The proposed method has two major improvements that make it well suited to the building extraction task. First, instead of predicting horizontal rectangular bounding boxes as many other detectors do, we obtain the minimum enclosing rectangles of buildings by adding a new term: the principal direction θ of each rectangle. Second, a new layer that integrates the advantages of both atrous convolution and the inception block is designed and inserted into the segmentation branch of Mask R-CNN so that the branch learns more representative features. We test the proposed method on a newly collected large Google Earth remote sensing dataset with diverse buildings and very complex backgrounds. Experiments demonstrate that it obtains promising results.
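To illustrate why a rotated box with a principal direction θ can fit a building more tightly than a horizontal box, the following is a minimal Python sketch. It is not the implementation used in the paper: the (cx, cy, w, h, θ) parameterization, the counter-clockwise angle convention, and all function names are assumptions made only for this example.

```python
import math


def rotated_box_corners(cx, cy, w, h, theta):
    """Corners of a w-by-h rectangle centred at (cx, cy), rotated by theta radians.

    The angle convention (counter-clockwise, in radians) is an assumption
    for this sketch, not the paper's stated encoding.
    """
    c, s = math.cos(theta), math.sin(theta)
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in offsets]


def horizontal_box(corners):
    """Axis-aligned (horizontal) bounding box enclosing the given corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


def polygon_area(corners):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(corners)
    for i in range(n):
        x0, y0 = corners[i]
        x1, y1 = corners[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


# A hypothetical elongated building, 40 x 10 pixels, oriented at 45 degrees.
corners = rotated_box_corners(100.0, 100.0, 40.0, 10.0, math.radians(45))
x_min, y_min, x_max, y_max = horizontal_box(corners)

rotated_area = polygon_area(corners)
horizontal_area = (x_max - x_min) * (y_max - y_min)
```

For this hypothetical building the rotated rectangle covers 400 square pixels, while the horizontal box that encloses it covers about 1250, so roughly two thirds of the horizontal box is background.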
