MAP-Net: Multiple Attending Path Neural Network for Building Footprint Extraction From Remote Sensed Imagery

Accurately and efficiently extracting building footprints from a wide range of remotely sensed imagery remains challenging because buildings have complex structures, varied scales, and diverse appearances. Existing convolutional neural network (CNN)-based building extraction methods are criticized for failing to detect tiny buildings, because the spatial information in CNN feature maps is lost during repeated pooling operations, and for producing inaccurate segmentation edges on large buildings. Moreover, features extracted by a CNN are always partial, being restricted by the size of the receptive field, so large-scale buildings with low texture are often extracted with discontinuities and holes. This paper proposes a novel multiple attending path neural network (MAP-Net) for accurately extracting multiscale building footprints and precise boundaries. MAP-Net learns spatial localization-preserved multiscale features through multiple parallel paths, in which each stage is gradually generated to extract high-level semantic features with fixed resolution. Then, an attention module adaptively squeezes channel-wise features from each path for optimization, and a pyramid spatial pooling module captures global dependency for refining discontinuous building footprints. Experimental results show that MAP-Net outperforms state-of-the-art (SOTA) algorithms in boundary localization accuracy as well as continuity of large buildings. Specifically, compared with the latest HRNetv2, our method improved precision by 0.68%, 1.74%, and 1.46% and IoU by 1.50%, 1.53%, and 0.82% on the Urban 3D, Deep Globe, and WHU datasets, respectively, without increasing computational complexity. The TensorFlow implementation is available at this https URL.
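To make the channel-wise attention step concrete, here is a minimal NumPy sketch of squeeze-and-excitation style gating applied to features concatenated from several parallel paths. It is an illustration under stated assumptions, not the authors' TensorFlow implementation: the function name `channel_attention`, the tensor shapes, the reduction ratio `r`, and the random weights standing in for learned parameters are all hypothetical.

```python
import numpy as np

def channel_attention(features, w1, w2):
    """SE-style channel gating for a feature map of shape (H, W, C)."""
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = features.mean(axis=(0, 1))            # shape (C,)
    # Excitation: bottleneck layer with ReLU, then sigmoid gates in (0, 1).
    hidden = np.maximum(squeezed @ w1, 0.0)          # shape (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(hidden @ w2)))     # shape (C,)
    # Reweight every channel by its gate (broadcast over H and W).
    return features * gates, gates

rng = np.random.default_rng(0)

# Assumption: three parallel paths whose outputs were already resampled
# to a common 8x8 resolution, each with 4 channels.
paths = [rng.standard_normal((8, 8, 4)) for _ in range(3)]
fused = np.concatenate(paths, axis=-1)               # shape (8, 8, 12)

C, r = fused.shape[-1], 4                            # r: hypothetical reduction ratio
w1 = rng.standard_normal((C, C // r)) * 0.1          # stand-in for learned weights
w2 = rng.standard_normal((C // r, C)) * 0.1          # stand-in for learned weights

weighted, gates = channel_attention(fused, w1, w2)
```

Each gate scales a whole channel, so channels from any path can be emphasized or suppressed before the fused features are passed on for prediction.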
