A Tale of Two “Forests”: Random Forest Machine Learning Aids Tropical Forest Carbon Mapping

Accurate and spatially explicit maps of tropical forest carbon stocks are needed to implement carbon offset mechanisms such as REDD+ (Reducing Emissions from Deforestation and Forest Degradation). The Random Forest machine learning algorithm may aid carbon mapping applications using remotely sensed data. However, Random Forest has never been compared to traditional and potentially more reliable techniques such as regionally stratified sampling and upscaling, and it has rarely been employed with spatial data. Here, we evaluated the performance of Random Forest in upscaling airborne LiDAR (Light Detection and Ranging)-based carbon estimates, compared to the stratification approach, over a 16-million-hectare focal area of the Western Amazon. We considered two runs of Random Forest, with and without spatial contextual modeling; in the former case, x and y position were included directly in the model. In each case, we set aside 8 million hectares (i.e., half of the focal area) for validation; this rigorous test of Random Forest went above and beyond the internal validation normally compiled by the algorithm (called “out-of-bag”), which proved insufficient for this spatial application. In this heterogeneous region of Northern Peru, the model with spatial context was the best-performing run of Random Forest and explained 59% of the variation in LiDAR-based carbon estimates within the validation area, compared to 37% for stratification and 43% for Random Forest without spatial context. With this roughly 60% relative improvement in explained variation over stratification, RMSE against validation LiDAR samples improved from 33 to 26 Mg C ha−1 when using Random Forest with spatial context. Our results suggest that spatial context should be considered when using Random Forest, and that doing so may result in substantially improved carbon stock modeling for purposes of climate change mitigation.
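
The validation design described above (holding out a contiguous half of the study area rather than relying on out-of-bag samples) and the two reported metrics can be illustrated with a minimal Python sketch. It is not the authors' code. The sample layout, the function names, the split along the x axis, and the definition of explained variation as 1 − SS_res/SS_tot are all assumptions made for illustration, and the numbers in the toy example are invented.

```python
import math

def spatial_holdout_split(samples, x_cut):
    """Split samples into two contiguous blocks by x position.

    Assumption: each sample is a dict with "x", "y" and "carbon" keys.
    Samples west of x_cut train the model; the rest are held out.
    """
    train = [s for s in samples if s["x"] < x_cut]
    valid = [s for s in samples if s["x"] >= x_cut]
    return train, valid

def r_squared(observed, predicted):
    """Fraction of variation explained, computed as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root-mean-square error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def relative_gain(baseline, improved):
    """Relative change of `improved` over `baseline`."""
    return (improved - baseline) / baseline

if __name__ == "__main__":
    # Invented toy samples (carbon in Mg C per hectare).
    samples = [
        {"x": 10, "y": 5, "carbon": 100},
        {"x": 20, "y": 9, "carbon": 120},
        {"x": 60, "y": 4, "carbon": 140},
        {"x": 80, "y": 7, "carbon": 160},
    ]
    train, valid = spatial_holdout_split(samples, x_cut=50)
    print(len(train), len(valid))

    # Invented predictions, only to exercise the metric functions.
    observed = [100, 120, 140, 160]
    predicted = [110, 120, 130, 150]
    print(r_squared(observed, predicted), rmse(observed, predicted))

    # Explained variation of 37% (stratification) versus 59% (Random
    # Forest with spatial context): (0.59 - 0.37) / 0.37 is about 0.59,
    # which matches the roughly 60% relative improvement quoted above.
    print(relative_gain(0.37, 0.59))
```

A contiguous block split keeps validation samples spatially separated from training samples. Out-of-bag samples, by contrast, can lie right beside training samples, which is one plausible reason an out-of-bag score can look better than performance in unsampled territory.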
