Multistrategy ensemble regression for mapping of built-up density and height with Sentinel-2 data

Abstract. In this paper, we establish a workflow for estimating built-up density and height from multispectral Sentinel-2 data. To do so, we frame the estimation of built-up density and height as a supervised learning problem. Because both target variables are measured on a ratio scale, the regression problem is regarded as finding the mapping between an input vector (i.e., ubiquitously available features computed from Sentinel-2 data) and an observable output (i.e., the training set), which is derived over spatially limited areas in an automated manner. Specifically, training sets are generated automatically from a joint exploitation of TanDEM-X mission elevation data and Sentinel-2 imagery and, as an alternative, from cadastral sources. The training sets are used to regress the target variables for spatial processing units that correspond to urban neighborhood scales. From a methodological point of view, we introduce a novel ensemble regression approach, multistrategy ensemble regression (MSER), based on advanced machine learning regression algorithms, including Random Forest Regression, Support Vector Regression, Gaussian Process Regression, and Neural Network Regression. To establish a robust ensemble, those algorithms are trained with a modified version of the AdaBoost.RT algorithm. In addition, to ensure diversity between the individual boosted regressors, we include a random feature subspace method in the procedure. In contrast to existing approaches, we selectively prune unfavorable regressors trained during the boosting procedure and calculate the final prediction as a weighted mean over the remaining models to improve prediction accuracy. Finally, the outputs are combined into a single prediction with a decision fusion strategy. Experimental results are obtained from four test areas that cover the settlement areas of the four largest German cities, i.e., Berlin, Hamburg, Munich, and Cologne.
The results underline the benefits of the MSER approach: in a comparative setup, all best predictions were obtained with a boosted regressor in conjunction with a decision fusion strategy. The mean absolute errors of the corresponding models range from 3 to 16% for built-up density and from 1 to 5.4 m for built-up height, depending on the validation strategy, the size of the spatial processing units, and the test area. In a domain adaptation setup (i.e., when a model is learned over a source domain and applied over a geographically different target domain), numerous predictions also reach accuracy levels comparable to those of predictions obtained within a source domain. This further underlines the viability of transferring a model and thus of substituting for training data in the target domains.
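
The boosting procedure summarized in the abstract can be illustrated with a minimal sketch. It follows the plain AdaBoost.RT scheme (relative-error threshold, weights proportional to log(1/beta)) combined with a random feature subspace per round, pruning of weak rounds, and a weighted mean over the remaining models. It is not the authors' modified algorithm. Everything named here is an assumption made for illustration: the least-squares base learner (a stand-in for the Random Forest, Support Vector, Gaussian Process, and Neural Network regressors), the function names, the parameters `phi`, `power`, `subspace`, and `prune_above`, the median used for decision fusion, and the synthetic data. Targets are assumed to be strictly positive so that the relative error is defined.

```python
import numpy as np


def _design(X, feats):
    """Selected feature columns plus a constant column for the intercept."""
    return np.column_stack([X[:, feats], np.ones(len(X))])


def fit_boosted_subspace(X, y, n_rounds=20, phi=0.1, power=2,
                         subspace=0.5, prune_above=0.5, seed=0):
    """AdaBoost.RT-style boosting with a random feature subspace per round.

    Returns a list of (features, coefficients, error_rate, weight) tuples.
    All parameter names and defaults are hypothetical.
    """
    rng = np.random.default_rng(seed)
    m, d = X.shape
    k = max(1, int(round(subspace * d)))
    D = np.full(m, 1.0 / m)  # sample distribution, uniform at the start
    models = []
    for _ in range(n_rounds):
        feats = rng.choice(d, size=k, replace=False)     # random subspace
        rows = rng.choice(m, size=m, replace=True, p=D)  # resample by D
        coef, *_ = np.linalg.lstsq(_design(X[rows], feats), y[rows],
                                   rcond=None)
        # Absolute relative error on the full training set.
        are = np.abs(_design(X, feats) @ coef - y) / np.abs(y)
        miss = are > phi
        # Error rate: distribution mass of samples above the threshold,
        # clipped so that beta stays strictly inside (0, 1).
        eps = float(np.clip(D[miss].sum(), 1e-10, 1 - 1e-10))
        beta = eps ** power
        # Down-weight well-predicted samples, then renormalize.
        D = np.where(miss, D, D * beta)
        D /= D.sum()
        models.append((feats, coef, eps, np.log(1.0 / beta)))
    # Prune rounds whose error rate is too high (assumed criterion);
    # fall back to the single best round if none survives.
    kept = [mdl for mdl in models if mdl[2] <= prune_above]
    return kept or [min(models, key=lambda mdl: mdl[2])]


def predict(models, X):
    """Weighted mean over the models that survived pruning."""
    preds = np.array([_design(X, feats) @ coef
                      for feats, coef, _, _ in models])
    w = np.array([weight for _, _, _, weight in models])
    return w @ preds / w.sum()


def fuse(predictions):
    """Decision fusion by the per-sample median (an assumed strategy)."""
    return np.median(np.vstack(predictions), axis=0)


# Synthetic, strictly positive targets for demonstration only.
rng = np.random.default_rng(42)
X = rng.uniform(size=(200, 4))
y = 10 + X @ np.array([3.0, 2.0, 1.0, 0.5]) + rng.normal(scale=0.1, size=200)

ensembles = [fit_boosted_subspace(X, y, seed=s) for s in range(3)]
fused = fuse([predict(e, X) for e in ensembles])
print("Training MAE of the fused prediction:", np.mean(np.abs(fused - y)))
```

In this sketch the three ensembles differ only in their random seed. In a setting closer to the one described above, each ensemble would instead wrap a different regression algorithm, and the error would be reported on held-out spatial processing units rather than on the training samples.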

[77]  Jianya Gong,et al.  Angular difference feature extraction for urban scene classification using ZY-3 multi-angle high-resolution satellite imagery , 2018 .