Paper information - On the Robustness and Generalization Ability of Building Footprint Extraction on the Example of SegNet and Mask R-CNN

On the Robustness and Generalization Ability of Building Footprint Extraction on the Example of SegNet and Mask R-CNN

Building footprint (BFP) extraction focuses on the precise pixel-wise segmentation of buildings from aerial photographs such as satellite images. BFP extraction is an essential task in remote sensing and forms the foundation for many higher-level analysis tasks, such as disaster management and the monitoring of city development. The task is challenging because buildings vary in size, shape, and appearance, both within a single region and across different regions of the world. In addition, effects such as occlusions, shadows, and poor lighting must also be considered and compensated for. A rich body of work on BFP extraction has been presented in the literature, and promising results have been reported on benchmark datasets. Despite this comprehensive work, it is still unclear how robust and generalizable state-of-the-art methods are across different regions, cities, settlement structures, and densities. The purpose of this study is to close this gap by investigating questions on the practical applicability of BFP extraction. In particular, we evaluate the robustness and generalizability of state-of-the-art methods as well as their transfer learning capabilities. To this end, we investigate in detail two of the most popular deep learning architectures for BFP extraction (i.e., SegNet, an encoder–decoder-based architecture, and Mask R-CNN, an object detection architecture) and evaluate them with respect to different aspects on a proprietary high-resolution satellite image dataset as well as on publicly available datasets. Results show that both networks generalize well to new data, new cities, and across cities from different continents. Both benefit from increased training data, especially when this data is from the same distribution (data source) or of comparable resolution. Transfer learning from a data source with different recording parameters is not always beneficial.

M. Zeppelzauer | David Koch | Miroslav Despotovic | Muntaha Sakeena | Eric Stumpe
