Deep Learning Models to Count Buildings in High-Resolution Overhead Images

This paper addresses the problem of counting buildings in very high-resolution, true-color overhead imagery. We study the relevance of deep-learning-based methods to this task, proposing and comparing two architectures and two loss functions. We show that a model enforcing equivariance to rotations is beneficial for counting in remotely sensed images. We also highlight the importance of a loss function that is robust to outliers in remote sensing applications.
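To illustrate why robustness to outliers matters when regressing counts, the sketch below compares the gradient of a squared-error loss with that of the Huber loss on a tile whose label is grossly wrong. This is a minimal illustration under stated assumptions: the abstract does not name the two loss functions, so the Huber loss is used here only as a well-known example of a robust loss, and the function names, the threshold `delta`, and the tile counts are all hypothetical.

```python
# Illustrative only: the Huber loss stands in for "a loss robust to outliers";
# it is not confirmed to be one of the losses compared in the paper.

def squared_error(pred, target):
    """Half squared error for a single predicted count."""
    return 0.5 * (pred - target) ** 2

def squared_error_grad(pred, target):
    """Gradient w.r.t. pred: grows linearly with the residual."""
    return pred - target

def huber(pred, target, delta=1.0):
    """Huber loss: quadratic for small residuals, linear beyond delta."""
    r = abs(pred - target)
    if r <= delta:
        return 0.5 * r * r
    return delta * (r - 0.5 * delta)

def huber_grad(pred, target, delta=1.0):
    """Gradient w.r.t. pred: the residual clipped to [-delta, delta]."""
    r = pred - target
    return max(-delta, min(delta, r))

# Hypothetical building counts per image tile.
# The last label is assumed to be a gross annotation error (an outlier).
targets = [12.0, 8.0, 15.0, 90.0]
preds = [11.0, 9.0, 14.0, 10.0]

for p, t in zip(preds, targets):
    print(
        f"target={t:5.1f} pred={p:5.1f} "
        f"grad_sq={squared_error_grad(p, t):7.1f} "
        f"grad_huber={huber_grad(p, t):5.1f}"
    )
```

With squared error, the single mislabeled tile contributes a gradient 80 times larger than each clean tile, so it can dominate a training step. With the Huber loss its gradient is capped at `delta`, so the outlier has no more influence than any other tile with a residual beyond the threshold.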
