Dual-Resolution U-Net: Building Extraction from Aerial Images

Deep learning has been applied to segment buildings from high-resolution images with promising results. However, there still exist the problems stemming from training on split patches and class imbalances. To overcome these problems, we propose a dual-resolution U-Net that uses pairs of images as inputs to capture both high and low resolution features. We also employ a soft Jaccard loss to place more emphasis on the sparse and low accuracy samples. The images from different regions are further balanced according to their building densities. With our architecture, we achieved state-of-the-art results on the Inria aerial image labeling dataset without any post-processing.

[1]  Guoying Zhao,et al.  SRN: Side-Output Residual Network for Object Symmetry Detection in the Wild , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Camille Couprie,et al.  Learning Hierarchical Features for Scene Labeling , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Konstantinos Kamnitsas,et al.  Efficient multi‐scale 3D CNN with fully connected CRF for accurate brain lesion segmentation , 2016, Medical Image Anal..

[4]  Wei Liu,et al.  ParseNet: Looking Wider to See Better , 2015, ArXiv.

[5]  Pierre Alliez,et al.  Can semantic labeling methods generalize to any city? the inria aerial image labeling benchmark , 2017, 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS).

[6]  Yang Wang,et al.  Optimizing Intersection-Over-Union in Deep Neural Networks for Image Segmentation , 2016, ISVC.

[7]  Shiyong Cui,et al.  BUILDING EXTRACTION FROM REMOTE SENSING DATA USING FULLY CONVOLUTIONAL NETWORKS , 2017 .

[8]  Junwei Han,et al.  A Survey on Object Detection in Optical Remote Sensing Images , 2016, ArXiv.

[9]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[10]  Yi Yang,et al.  Attention to Scale: Scale-Aware Semantic Image Segmentation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Geoffrey E. Hinton,et al.  Machine Learning for Aerial Image Labeling , 2013 .

[12]  Rob Fergus,et al.  Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-scale Convolutional Architecture , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[13]  Kaiming He,et al.  Focal Loss for Dense Object Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[14]  Luc Van Gool,et al.  The Pascal Visual Object Classes Challenge: A Retrospective , 2014, International Journal of Computer Vision.

[15]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[16]  Raquel Urtasun,et al.  DeepRoadMapper: Extracting Road Topology from Aerial Images , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[17]  Kaiming He,et al.  Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).