A Comparative Study of Deep Learning Approaches to Rooftop Detection in Aerial Images

Abstract This paper investigates deep neural networks for rapid and accurate detection of building rooftops in aerial orthoimages. The networks were trained on manually labeled rooftop vector data digitized from aerial orthoimagery covering the Kitchener-Waterloo area. The performance of three deep learning methods, U-Net, Fully Convolutional Network (FCN), and DeepLabv3+, was compared on the training, validation, and testing sets of the dataset. Our results show that DeepLabv3+ achieved 63.8% in Intersection over Union (IoU), 77.8% in mean IoU (mIoU), 74% in precision, and 78% in F1-score. After focal loss was introduced, training loss dropped substantially and convergence became considerably faster. Rooftop detection performance also improved: DeepLabv3+ reached 93.6% in average pixel accuracy, with 65.4% in IoU, 79.0% in mIoU, 77.6% in precision, and 79.1% in F1-score. Lastly, to evaluate the effect of data volume, an ablation study reduced the data volume from 100% to 75% and 50%. Extraction performance worsened as data volume decreased, with IoU, mIoU, precision, and F1-score mostly declining.
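The metrics and the loss named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a binary labeling (1 = rooftop, 0 = background), takes mIoU as the mean of the rooftop and background IoU, and treats pixel accuracy as the overall fraction of correctly labeled pixels, which may differ from the paper's "average pixel accuracy". The focal loss follows the commonly used binary form FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t); the defaults gamma = 2.0 and alpha = 0.25 are illustrative assumptions, since the abstract does not state the values used. All function names are hypothetical.

```python
import math


def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN for flat sequences of 0/1 pixel labels (1 = rooftop)."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a class is absent."""
    return numerator / denominator if denominator else 0.0


def segmentation_metrics(pred, truth):
    """Binary segmentation metrics computed from pixel-level confusion counts."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    iou_roof = safe_div(tp, tp + fp + fn)        # IoU of the rooftop class
    iou_background = safe_div(tn, tn + fp + fn)  # IoU of the background class
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "iou": iou_roof,
        "miou": (iou_roof + iou_background) / 2,  # assumed: mean over two classes
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "pixel_accuracy": safe_div(tp + tn, tp + fp + fn + tn),
    }


def binary_focal_loss(prob, target, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for one pixel; prob is the predicted rooftop probability.

    gamma and alpha are assumed defaults, not values taken from the paper.
    """
    prob = min(max(prob, eps), 1.0 - eps)         # clamp to avoid log(0)
    p_t = prob if target == 1 else 1.0 - prob     # probability of the true class
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0, 0, 0]
    reference = [1, 0, 0, 0, 1, 1, 0, 0]
    print(segmentation_metrics(predicted, reference))
    # A confident correct pixel contributes far less loss than a badly missed one.
    print(binary_focal_loss(0.9, 1), binary_focal_loss(0.1, 1))
```

The factor (1 - p_t)^gamma shrinks the contribution of pixels that are already classified well, so training concentrates on hard pixels; this is the usual motivation for focal loss on imbalanced data such as aerial scenes where background pixels dominate.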
