Monitoring Forest Health Using Hyperspectral Imagery: Does Feature Selection Improve the Performance of Machine-Learning Techniques?

This study analyzed highly correlated, feature-rich datasets derived from hyperspectral remote sensing using multiple machine-learning and statistical-learning methods. The effect of filter-based feature-selection methods on predictive performance was compared, and the effect of multiple expert-based and data-driven feature sets derived from the reflectance data was investigated. Tree defoliation (%) was modeled as a function of reflectance, and variable importance was assessed using permutation-based feature importance. Overall, the support vector machine (SVM) outperformed the other learners, random forest (RF), extreme gradient boosting (XGBoost), lasso (L1) and ridge (L2) regression, by at least three percentage points. Combining certain feature sets yielded small increases in predictive performance, while no substantial differences between individual feature sets were observed. For some combinations of learners and feature sets, filter methods achieved better predictive performance than the unfiltered feature sets, whereas ensemble filters had no substantial impact on performance. Permutation-based feature importance identified features around the red edge as most important for the models; however, the presence of features in the near-infrared region (800–1000 nm) was essential to achieve the best performance. More training data and replication in similar benchmarking studies are needed to draw more generalizable conclusions. Filter methods have the potential to be helpful in high-dimensional settings and can improve the interpretation of feature effects in fitted models, which is an essential requirement in environmental modeling studies.
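To make the described workflow concrete, the following is a minimal R sketch, not the authors' original pipeline: it applies a simple correlation-based filter to a simulated reflectance matrix, fits an SVM and a random forest as examples of the compared learners, and computes permutation-based feature importance by hand. The simulated data, the choice of packages (kernlab, ranger), and the 50-band cutoff are illustrative assumptions only.

```r
library(kernlab)   # ksvm(): support vector machine regression
library(ranger)    # ranger(): random forest regression

set.seed(1)

# Simulated stand-in for the field data: 200 tree observations,
# 100 reflectance bands, defoliation (%) as the response.
n_obs   <- 200
n_bands <- 100
reflectance <- matrix(runif(n_obs * n_bands), nrow = n_obs,
                      dimnames = list(NULL, paste0("band_", seq_len(n_bands))))
defoliation <- runif(n_obs, 0, 100)

# Filter-based feature selection: rank bands by absolute Pearson
# correlation with the response and keep the 50 highest-ranked bands.
filter_scores <- abs(cor(reflectance, defoliation))[, 1]
selected      <- order(filter_scores, decreasing = TRUE)[1:50]
dat <- data.frame(defoliation = defoliation, reflectance[, selected])

# Fit two of the compared learners on the filtered feature set.
svm_fit <- ksvm(defoliation ~ ., data = dat, kernel = "rbfdot", C = 1)
rf_fit  <- ranger(defoliation ~ ., data = dat, num.trees = 500)

# Permutation-based feature importance for the SVM: permute one band at a
# time and record the increase in RMSE relative to the unpermuted data.
rmse     <- function(obs, pred) sqrt(mean((obs - pred)^2))
baseline <- rmse(dat$defoliation, predict(svm_fit, dat))
importance <- sapply(names(dat)[-1], function(band) {
  permuted         <- dat
  permuted[[band]] <- sample(permuted[[band]])
  rmse(dat$defoliation, predict(svm_fit, permuted)) - baseline
})
head(sort(importance, decreasing = TRUE))
```

In practice the resampling scheme, hyperparameter tuning, and the particular filter would follow the study design; this sketch only illustrates how a filter, a learner, and permutation importance fit together.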
