Bayesian principal component regression model with spatial effects for forest inventory variables under small field sample size.

Abstract: Remote sensing observations are extensively used for the analysis of environmental variables. These variables often exhibit spatial correlation, which has to be accounted for in the calibration models used for prediction, either by modelling the dependencies directly or by allowing for spatially correlated stochastic effects. Another feature of many remote sensing instruments is that the derived predictor variables are highly correlated, which can lead to overfitting and, at worst, to singularities in the estimates. Both issues degrade prediction accuracy, especially when the training set for model calibration is small. To overcome these modelling challenges, we present a general model calibration procedure for remotely sensed data and apply it to airborne laser scanning data for forest inventory. We use a linear regression model that handles multicollinearity in the predictors through principal components and Bayesian regularization. The model includes a spatial random effect component for the spatial correlation that is not explained by a simple linear model. An efficient Markov chain Monte Carlo sampling scheme is used to account for the uncertainty in all model parameters. We tested the proposed model against several alternatives, and it outperformed the other linear calibration models, especially when spatial effects and multicollinearity were present and the training set was small.
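The combination of principal components and Bayesian regularization described above can be illustrated with a minimal sketch. This is not the authors' model: it omits the spatial random effect and the MCMC sampler, and instead assumes a known noise variance and a zero-mean Gaussian prior on the component coefficients, so that the posterior is available in closed form. The function names (`fit_bayes_pcr`, `predict`), the parameters `prior_precision` and `noise_var`, and the simulated data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_bayes_pcr(X, y, n_components, prior_precision=1.0, noise_var=1.0):
    """Bayesian regression on the leading principal components of X.

    Assumptions (illustrative only): coefficients ~ N(0, I / prior_precision),
    Gaussian noise with known variance noise_var, no spatial effect.
    """
    # Standardize predictors so that no single variable dominates the PCA.
    x_mean, x_std = X.mean(axis=0), X.std(axis=0)
    Z = (X - x_mean) / x_std

    # Principal directions are the right singular vectors of Z.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_components].T          # shape (p, k)
    T = Z @ V                        # component scores, shape (n, k)

    # Conjugate Gaussian posterior for the coefficients of the scores.
    y_mean = y.mean()
    precision = T.T @ T / noise_var + prior_precision * np.eye(n_components)
    coef_cov = np.linalg.inv(precision)
    coef_mean = coef_cov @ T.T @ (y - y_mean) / noise_var

    return {"x_mean": x_mean, "x_std": x_std, "y_mean": y_mean,
            "V": V, "coef_mean": coef_mean, "coef_cov": coef_cov}


def predict(model, X_new):
    """Posterior mean prediction for new predictor rows."""
    Z = (X_new - model["x_mean"]) / model["x_std"]
    return model["y_mean"] + Z @ model["V"] @ model["coef_mean"]


# Simulated small sample with highly correlated predictors:
# 10 predictors driven by only 2 latent factors, 30 "field plots".
rng = np.random.default_rng(0)
n, p, k = 30, 10, 2
latent = rng.normal(size=(n, k))
loadings = rng.normal(size=(k, p))
X = latent @ loadings + 0.05 * rng.normal(size=(n, p))
y = latent @ np.array([2.0, -1.0]) + 0.3 * rng.normal(size=n)

model = fit_bayes_pcr(X, y, n_components=k)
y_hat = predict(model, X)
```

Regressing on a few component scores sidesteps the near-singular cross-product matrix of the raw predictors, and the Gaussian prior shrinks the coefficients further when the sample is small. A fuller treatment in the spirit of the abstract would add a spatially correlated random effect and sample all parameters, including the variances, by MCMC.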
