Learning to extract buildings from ultra-high-resolution drone images and noisy labels

ABSTRACT Building maps have a plethora of applications in government, industry and academia. In most cases, large-scale maps can be retrieved from OpenStreetMap vector data. However, for certain rapidly changing built and semi-built environments, the corresponding maps are less accurate and contain label noise such as missing, spurious and shifted labels, mainly because buildings in those regions are constantly being constructed, demolished, replaced and altered. One such case is the Rohingya camps in the southeastern border region of Bangladesh. The mass refugee influx in late 2017 and the subsequent population growth have necessitated the construction of buildings and the expansion of camps. Consequently, reliable methods are needed for detecting and documenting camp buildings. Ultra-high-resolution drone images of the Rohingya camps are semantically segmented with fully convolutional U-Net deep learning systems to generate accurate building maps from noisy labels. A wide variety of noise is prevalent in the labels. The deep learning systems provide less noisy predictions than the classification tool in ArcGIS, the most widely used Geographic Information System (GIS) software. Data augmentation and regularization allow reliable learning, even in the presence of label noise. During testing, computing numeric performance metrics against noisy labels can grossly underestimate the true skill of the model. A subset of 22 million pixels of the testing data is therefore relabelled by hand to obtain noise-free labels. Testing our generated maps against both noisy and noise-free labels confirms that true performance is higher than evaluation against freely available building maps would indicate. Empirical results reveal that the pipeline is able to learn from noisy data and produce labels that are more accurate and less noisy. Labels generated by our best-performing system provide an Intersection-over-Union (IoU) gain of 17.6% and a Dice score gain of 13.6% over freely available labels from OpenStreetMap. Finally, spatio-temporal building maps are generated to demonstrate the applicability of this research.
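For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how IoU and Dice score are commonly computed for binary building masks. It is not the authors' evaluation code: the function name `iou_and_dice`, the toy masks, and the convention of scoring two empty masks as a perfect match are all assumptions made for illustration.

```python
import numpy as np


def iou_and_dice(pred, truth):
    """Return (IoU, Dice) for two binary masks of equal shape.

    Hypothetical helper for illustration only; 1/True marks building pixels.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    if pred.shape != truth.shape:
        raise ValueError("masks must have the same shape")

    intersection = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    total = pred.sum() + truth.sum()

    # Assumed convention: two empty masks count as a perfect match.
    iou = intersection / union if union else 1.0
    dice = 2 * intersection / total if total else 1.0
    return float(iou), float(dice)


# Toy 3x4 masks (made-up values, not data from the paper).
truth = np.array([[0, 1, 1, 0],
                  [0, 1, 1, 0],
                  [0, 0, 0, 0]])
pred = np.array([[0, 1, 1, 1],
                 [0, 1, 0, 0],
                 [0, 0, 0, 0]])

iou, dice = iou_and_dice(pred, truth)
print(f"IoU = {iou:.2f}, Dice = {dice:.2f}")
```

The two scores are monotonically related (Dice = 2·IoU / (1 + IoU)), so they rank predictions identically. The same function can score a prediction against either the noisy or the hand-relabelled masks, which is the comparison the abstract describes.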
