A spatial ensemble approach for broad-area mapping of land surface properties

Abstract: Understanding rapid global change requires land cover maps that combine broad spatial extent with fine spatial and temporal resolution. Developing such maps is challenging because variability in the relationships between spectral characteristics (i.e., predictors) and a response variable is likely to increase with the size of the region over which a model is built and applied. Most mapping approaches apply the same predictor-response relationships across the entire modeling region, yet relationships learned in one local area may be invalid in another when predicting across broad extents. Here, we adapted a spatial ensemble approach from species distribution modeling to land cover mapping, and evaluated whether the approach could faithfully represent spatial variation in relationships between land cover and spectral data. The spatiotemporal exploratory model (STEM) uses an ensemble of regression trees defined within spatially overlapping support sets, producing a broad-extent map that reflects variability at the spatial scale of each constituent support set. As test cases, we used 30-m-resolution forest canopy and impervious surface cover layers from the 2001 U.S. National Land Cover Database (NLCD) for the states of Washington, Oregon, and California as reference maps. When testing strategies for support set size and sampling intensity, we found that predictor-response relationships were strongest when individual components of the spatial ensemble were small and when sampling intensity was high. Compared to aspatial bagged decision tree and random forest models, the STEM approach successfully captured variation in our source maps, both globally and at scales smaller than the modeling region. Leveraging the spatial structure of a STEM, we also mapped per-pixel spatial variation in prediction confidence and in the importance of different predictor variables.
After testing appropriate spatial ensemble and sampling strategies, we extended the predictor-response relationships gleaned from the 2001 source maps into a yearly time series based on temporally smoothed spectral data from the LandTrendr algorithm. The end products were yearly forest canopy and impervious surface cover time series representing 1990–2012. Formal evaluation showed that our temporally extended maps also closely resembled NLCD maps from 2011. The aim of this research was to cultivate the implicit relationships between spectral data and a given map, not to improve them. As the need for time series maps produced at both broad extents and fine resolutions increases, however, our results suggest that an ensemble of locally defined estimators may be more appropriate than conventional ensemble models for land cover mapping across broad extents.
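To make the idea of an ensemble of regression trees fitted within spatially overlapping support sets concrete, the following is a minimal sketch in Python with scikit-learn. It is not the authors' implementation. The square support sets, their size and number, the synthetic data, and the helper names `fit_stem` and `predict_stem` are all assumptions made for illustration. The synthetic response depends on a single predictor with a slope that changes sign from west to east, so an aspatial random forest cannot recover it from the predictor alone, whereas locally fitted trees can. The spread among overlapping trees is returned as one possible proxy for per-pixel prediction confidence.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor


def fit_stem(xy, X, y, n_sets=300, set_size=0.3, min_samples=20, seed=0):
    """Fit one regression tree per randomly placed square support set."""
    rng = np.random.default_rng(seed)
    lo, hi = xy.min(axis=0), xy.max(axis=0)
    members = []
    for _ in range(n_sets):
        # Random lower-left corner; squares may overhang the domain edge.
        corner = rng.uniform(lo - set_size / 2, hi - set_size / 2)
        inside = np.all((xy >= corner) & (xy < corner + set_size), axis=1)
        if inside.sum() < min_samples:
            continue  # too few samples to fit a local tree
        tree = DecisionTreeRegressor(min_samples_leaf=5, random_state=0)
        tree.fit(X[inside], y[inside])
        members.append((corner, tree))
    return members


def predict_stem(members, set_size, xy, X):
    """Average the trees whose support set covers each location."""
    total = np.zeros(len(xy))
    total_sq = np.zeros(len(xy))
    count = np.zeros(len(xy))
    for corner, tree in members:
        inside = np.all((xy >= corner) & (xy < corner + set_size), axis=1)
        if inside.any():
            p = tree.predict(X[inside])
            total[inside] += p
            total_sq[inside] += p ** 2
            count[inside] += 1
    # Locations covered by no support set come out as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = total / count
        var = total_sq / count - mean ** 2
    spread = np.sqrt(np.clip(var, 0, None))
    return mean, spread, count


# Synthetic data (hypothetical): the slope of the predictor-response
# relationship runs from -1 in the west to +1 in the east.
rng = np.random.default_rng(42)
n = 4000
xy_all = rng.uniform(0, 1, size=(n, 2))   # sample coordinates
X_all = rng.uniform(0, 1, size=(n, 1))    # one "spectral" predictor
y_all = (2 * xy_all[:, 0] - 1) * X_all[:, 0] + rng.normal(0, 0.02, n)

train, test = np.arange(3000), np.arange(3000, n)
xy_test, X_test, y_test = xy_all[test], X_all[test], y_all[test]

set_size = 0.3
members = fit_stem(xy_all[train], X_all[train], y_all[train], set_size=set_size)
pred, spread, count = predict_stem(members, set_size, xy_test, X_test)
covered = count > 0

# Aspatial baseline: same predictor, no knowledge of location.
rf = RandomForestRegressor(n_estimators=50, min_samples_leaf=5, random_state=0)
rf.fit(X_all[train], y_all[train])
pred_rf = rf.predict(X_test)

rmse_stem = np.sqrt(np.mean((pred[covered] - y_test[covered]) ** 2))
rmse_rf = np.sqrt(np.mean((pred_rf[covered] - y_test[covered]) ** 2))
print(f"STEM RMSE: {rmse_stem:.3f}  aspatial RF RMSE: {rmse_rf:.3f}")
```

In this sketch, smaller values of `set_size` make each tree more local, and more training samples per support set stand in for a higher sampling intensity. Both are parameters of the toy example, not the values used in the study.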
