Penalized regression techniques for prediction: a case study for predicting tree mortality using remotely sensed vegetation indices

Constructing models can be complicated when the available fitting data are highly correlated and of high dimension. However, the complications depend on whether the goal is prediction rather than estimation. We focus on predicting tree mortality (measured as the number of dead trees) from change metrics derived from Moderate Resolution Imaging Spectroradiometer (MODIS) satellite images. The high dimensionality and multicollinearity inherent in such data are of particular concern. Standard regression techniques perform poorly for such data, so we examine shrinkage regression techniques such as ridge regression, the LASSO, and partial least squares, which yield more robust predictions. We also suggest efficient strategies, such as the 0.632+ bootstrap and generalized cross-validation, for selecting optimal models. The techniques are compared using simulations and then used to predict the severity of insect-induced tree mortality for a Pinus radiata D. Don plantation in southern New South Wales, Austr...
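
To make the idea of shrinkage with data-driven tuning concrete, the following is a minimal sketch of ridge regression with the penalty chosen by generalized cross-validation. It is an illustration under stated assumptions, not the procedure used in the paper: the function name `ridge_gcv`, the simulated collinear predictors, and the penalty grid are all hypothetical, and the effective degrees of freedom are assumed to include one extra unit for the intercept.

```python
import numpy as np

def ridge_gcv(X, y, lambdas):
    """Ridge regression with the penalty chosen by generalized cross-validation.

    Hypothetical helper for illustration only. Returns the selected penalty,
    the intercept, the coefficients on the original predictor scale, and the
    GCV score for every candidate penalty.
    """
    # Standardize predictors and center the response so the penalty
    # treats all predictors equally and the intercept is unpenalized.
    x_mean, x_std = X.mean(axis=0), X.std(axis=0)
    Xs = (X - x_mean) / x_std
    y_mean = y.mean()
    yc = y - y_mean
    n = Xs.shape[0]

    # One SVD serves every penalty value.
    U, d, Vt = np.linalg.svd(Xs, full_matrices=False)
    Uty = U.T @ yc

    scores = np.empty(len(lambdas))
    for i, lam in enumerate(lambdas):
        shrink = d**2 / (d**2 + lam)          # shrinkage factor per component
        rss = np.sum((yc - U @ (shrink * Uty)) ** 2)
        edf = 1.0 + shrink.sum()              # assumption: +1 for the intercept
        scores[i] = n * rss / (n - edf) ** 2  # GCV = (RSS/n) / (1 - edf/n)^2

    best_lam = float(lambdas[int(np.argmin(scores))])
    beta_std = Vt.T @ (d / (d**2 + best_lam) * Uty)
    beta = beta_std / x_std                   # back to the original scale
    intercept = y_mean - x_mean @ beta
    return best_lam, intercept, beta, scores

# Simulated data (an assumption, not the study data): 20 predictors that all
# track one latent signal, so they are strongly collinear.
rng = np.random.default_rng(0)
n, p = 60, 20
z = rng.normal(size=n)
X = z[:, None] + 0.1 * rng.normal(size=(n, p))
y = X[:, :3].sum(axis=1) + rng.normal(scale=0.5, size=n)

lambdas = np.logspace(-3, 3, 25)
best_lam, intercept, beta, scores = ridge_gcv(X, y, lambdas)
pred = intercept + X @ beta
print("selected penalty:", best_lam)
```

Because the singular value decomposition is computed once, scoring each candidate penalty costs only a few vector operations, which is one reason generalized cross-validation is a cheap alternative to refitting the model on resampled data.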
