A Multiple-Feature Reuse Network to Extract Buildings from Remote Sensing Imagery

Automatic building extraction from remote sensing imagery is important in many applications. The success of convolutional neural networks (CNNs) has led to advances in extracting man-made objects from high-resolution imagery. However, the large variations in building appearance and size make it difficult to extract both densely packed small buildings and large buildings. Because of GPU memory limitations, high-resolution imagery must be split into patches before it is fed to CNN models, so a building is often only partially contained in a single patch, with little context information. To overcome the problems that common CNN models face when using image features at different levels, this paper proposes a novel CNN architecture called a multiple-feature reuse network (MFRN), in which each layer is connected to all subsequent layers of the same size, enabling direct use of the hierarchical features in each layer. In addition, the model includes a smart decoder that enables precise localization with less GPU load. We tested our model on a large real-world remote sensing dataset and obtained an overall accuracy of 94.5% and an F1 score of 85%, outperforming the compared CNN models, including a 56-layer fully convolutional DenseNet with an overall accuracy of 93.8% and an F1 score of 83.5%. The experimental results indicate that the MFRN approach to connecting convolutional layers improves the performance of common CNN models in extracting buildings of different sizes and can achieve high accuracy with a consumer-level GPU.
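
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how overall accuracy and F1 score are commonly computed for a binary building mask. It assumes a per-pixel evaluation with "building" as the positive class; the abstract does not state the exact evaluation protocol, and the function names and toy masks here are hypothetical illustrations rather than the authors' code.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two same-shaped binary masks (lists of rows)."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # building predicted, building present
            elif p and not t:
                fp += 1      # building predicted, background present
            elif t:
                fn += 1      # background predicted, building present
            else:
                tn += 1      # background predicted, background present
    return tp, fp, fn, tn


def overall_accuracy(tp, fp, fn, tn):
    """Fraction of all pixels that are labelled correctly."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive class."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy 2x3 masks (assumed data): 1 = building, 0 = background.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]

tp, fp, fn, tn = confusion_counts(pred, truth)
print("overall accuracy:", overall_accuracy(tp, fp, fn, tn))
print("F1 score:", f1_score(tp, fp, fn))
```

Overall accuracy counts background pixels as well, so it can stay high when buildings cover only a small share of the image; the F1 score ignores true negatives and is therefore the more sensitive of the two to missed or spurious building pixels.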
