Leaf Area Index Estimation Algorithm for GF-5 Hyperspectral Data Based on Different Feature Selection and Machine Learning Methods

Leaf area index (LAI) is an essential vegetation parameter that characterizes light energy utilization and vegetation canopy structure. As the only hyperspectral satellite launched by China that is currently in operation, GF-5 is potentially useful for accurate LAI estimation. However, no research has focused on evaluating GF-5 data for LAI estimation. Hyperspectral remote sensing data contain abundant information about the reflective characteristics of vegetation canopies, but this abundance also easily leads to the curse of dimensionality. Therefore, feature selection (FS) is necessary to reduce data redundancy and achieve more reliable estimations. Machine learning (ML) algorithms are now widely used for FS, and the same ML algorithm is usually used for both FS and regression in LAI estimation. However, no evidence suggests that this is the optimal solution. Therefore, this study evaluates the capacity of GF-5 spectral reflectance for estimating LAI and the performance of different combinations of FS and ML algorithms. First, the PROSAIL model, which couples the leaf optical properties model PROSPECT with the scattering by arbitrarily inclined leaves (SAIL) model, was used to generate simulated GF-5 reflectance data under different vegetation and soil conditions. Then three FS methods, namely random forest (RF), K-means clustering (K-means) and mean impact value (MIV), and three ML algorithms, namely random forest regression (RFR), back propagation neural network (BPNN) and K-nearest neighbor (KNN), were used to develop nine LAI estimation models. The FS process was conducted twice using different strategies. First, the three FS methods were used to search for the lowest number of dimensions that maintained the estimation accuracy obtained with all bands. Then, the sequential backward selection (SBS) method was used to eliminate the bands that had minimal impact on LAI estimation accuracy.
Finally, the three best estimation models were selected and evaluated using reference LAI. The results showed that although the RF_RFR model (RF used for feature selection and RFR used for regression) achieved reliable LAI estimates (coefficient of determination (R²) = 0.828, root mean square error (RMSE) = 0.839), the poor performance (R² = 0.763, RMSE = 0.987) of the MIV_BPNN model (MIV used for feature selection and BPNN used for regression) suggested that performing feature selection and regression with the same ML algorithm does not always ensure an optimal estimation. Moreover, RF selection preserved the most informative bands for LAI estimation, so that each ML regression method could achieve satisfactory estimation results. Finally, the results indicated that the RF_KNN model (RF used for feature selection and KNN used for regression), using the reflectance of seven GF-5 spectral bands, achieved better estimation results than the other models when validated with simulated data (R² = 0.834, RMSE = 0.824) and actual reference LAI (R² = 0.659, RMSE = 0.697).
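
The second selection stage and the two accuracy metrics can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the toy data generator, the function names, the greedy stopping rule and the `tol` parameter are all assumptions made for illustration, and a small KNN regressor stands in for whichever regressor is paired with SBS. The sketch removes one band at a time for as long as the held-out RMSE does not get worse by more than `tol`.

```python
import math
import random


def knn_predict(train_X, train_y, x, k=5):
    """Average the targets of the k training samples closest to x."""
    nearest = sorted((math.dist(row, x), y) for row, y in zip(train_X, train_y))
    return sum(y for _, y in nearest[:k]) / k


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def score_subset(bands, train, test, k=5):
    """Held-out RMSE of KNN regression restricted to the given band indices."""
    tr_X = [[row[b] for b in bands] for row, _ in train]
    tr_y = [y for _, y in train]
    preds = [knn_predict(tr_X, tr_y, [row[b] for b in bands], k) for row, _ in test]
    return rmse([y for _, y in test], preds)


def sequential_backward_selection(n_bands, train, test, tol=0.0, min_bands=1):
    """Greedy SBS: drop the least harmful band while RMSE stays within tol."""
    selected = list(range(n_bands))
    best = score_subset(selected, train, test)
    while len(selected) > min_bands:
        # Evaluate every single-band removal and keep the least harmful one.
        trials = [
            (score_subset([b for b in selected if b != c], train, test), c)
            for c in selected
        ]
        trial_rmse, drop = min(trials)
        if trial_rmse > best + tol:
            break  # every removal degrades accuracy too much
        selected.remove(drop)
        best = min(best, trial_rmse)
    return selected, best


def make_toy_data(n, n_bands=6, seed=0):
    """Hypothetical data: the target depends on bands 0 and 2 only."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.random() for _ in range(n_bands)]
        data.append((x, 4.0 * x[0] + 2.0 * x[2]))
    return data


if __name__ == "__main__":
    train, test = make_toy_data(200, seed=0), make_toy_data(100, seed=1)
    bands, final_rmse = sequential_backward_selection(6, train, test)
    print("kept bands:", bands, "RMSE:", round(final_rmse, 3))
```

The stopping rule here compares against the best RMSE seen so far, so with `tol=0.0` the final subset never scores worse on the held-out split than the full band set. A real workflow would score candidates with cross-validation rather than a single split.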
