Use of advanced modelling methods to estimate radiata pine productivity indices

Abstract: Site productivity indices have been widely used to describe age-normalised height and volume for a range of forest species. In this study we used a wide range of modelling methods to predict Site Index and 300 Index for Pinus radiata D. Don. Site Index normalises height to a standardised age, while the 300 Index normalises volume measurements to a standardised age, stand density and set of silvicultural conditions. These two indices were derived from a national database of 3,676 plots, with predictors extracted from geospatial surfaces describing key landform, topographic, climatic, edaphic and species-specific features (e.g. disease severity). Using these data, our objectives were to (i) compare the accuracy of geospatial, parametric and non-parametric models in predicting Site Index and 300 Index, (ii) determine whether regression kriging could be used to improve the accuracy of these predictions, (iii) identify the most influential predictors of these two indices and (iv) produce maps of both indices across New Zealand. All predictions were made on a test dataset (n = 1,104) that was not used for model fitting. The two non-parametric models, eXtreme Gradient Boosting (XGBoost) and random forest, provided the most precise predictions of Site Index and 300 Index and markedly outperformed both the parametric and the geospatial models (ordinary kriging, inverse distance weighting). Random forest provided the most precise predictions of Site Index (R² = 0.811, RMSE = 2.027 m, RMSE% = 6.73%), while XGBoost most precisely predicted 300 Index (R² = 0.676, RMSE = 3.462 m³ ha⁻¹ yr⁻¹, RMSE% = 12.63%). Regression kriging improved the fit of all but one model by accounting for spatial covariance in the model error. Gains in precision were most marked for the parametric models, and in particular the regression model. After kriging, the three most precise models for both indices were random forest, followed by XGBoost and the regression model. An ensemble model derived from the mean predictions of these three models provided the most precise predictions, among all tested models, for both Site Index (R² = 0.818, RMSE = 1.991 m, RMSE% = 6.61%) and 300 Index (R² = 0.691, RMSE = 3.384 m³ ha⁻¹ yr⁻¹, RMSE% = 12.35%). Fitting a range of models to productivity indices was found to be a useful approach, as it allows an ensemble model to be created and provides greater insight into the key determinants of productivity.
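The fit statistics and the mean-of-predictions ensemble described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names and the toy Site Index values are hypothetical, and RMSE% is assumed here to be the RMSE expressed as a percentage of the mean observed value, a definition the abstract does not state.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error, in the units of the response."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_percent(observed, predicted):
    """RMSE as a percentage of the mean observed value (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_obs


def ensemble_mean(*model_predictions):
    """Unweighted mean of the predictions from several models, plot by plot."""
    return [sum(values) / len(values) for values in zip(*model_predictions)]


# Hypothetical test-set values (Site Index in metres); not data from the study.
observed = [28.0, 30.0, 32.0, 34.0]
pred_rf = [29.0, 30.0, 31.0, 35.0]    # stand-in for random forest
pred_xgb = [27.0, 31.0, 33.0, 33.0]   # stand-in for XGBoost
pred_reg = [29.0, 29.0, 32.0, 35.0]   # stand-in for the regression model

pred_ensemble = ensemble_mean(pred_rf, pred_xgb, pred_reg)

for name, pred in [("rf", pred_rf), ("xgb", pred_xgb),
                   ("reg", pred_reg), ("ensemble", pred_ensemble)]:
    print(name,
          round(r_squared(observed, pred), 3),
          round(rmse(observed, pred), 3),
          round(rmse_percent(observed, pred), 2))
```

In this toy example the errors of the three stand-in models partly cancel, so the ensemble has a lower RMSE than any single model. That outcome is a property of the invented numbers and is not guaranteed in general.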
