Hyperspectral Estimation of Chlorophyll Content in Apple Tree Leaf Based on Feature Band Selection and the CatBoost Model

Leaf chlorophyll content (LCC) is a key indicator of apple tree nutrition and can be used to assess growth status. Hyperspectral data offer an important means of detecting LCC in apple trees. In this study, hyperspectral data and measured LCC were collected. The original spectrum (OR) was pretreated with several spectral transformations. Feature bands were then selected using the competitive adaptive reweighted sampling (CARS) algorithm, the random frog (RF) algorithm, the elastic net (EN) algorithm, and the combined EN-RF and EN-CARS algorithms. Partial least squares regression (PLSR), random forest regression (RFR), and CatBoost models were built both before and after grid search parameter optimization to estimate LCC. The results showed the following: (1) The spectrum after second derivative (SD) transformation had the highest correlation with LCC (–0.929), and the SD-based models were the most accurate, making SD an effective spectral pretreatment method for apple tree LCC estimation. (2) Compared with the single band selection algorithms, the EN-RF algorithm reduced dimensionality more effectively and generally gave higher modeling accuracy. (3) CatBoost with grid search optimization gave the best estimates. On the validation set, the optimized SD-EN-CARS-CatBoost model had the highest accuracy, with a coefficient of determination (R²), root mean square error (RMSE), and relative prediction deviation (RPD) of 0.923, 2.472, and 3.64, respectively. With its high accuracy and reliability, the optimized SD-EN-CARS-CatBoost model can be used to monitor apple tree growth, support intelligent orchard management, and facilitate the economic development of the fruit industry.
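The three validation metrics reported above can be illustrated with a short sketch. The abstract does not give the formulas, so the definitions here are assumptions based on common usage: R² as one minus the ratio of residual to total sum of squares, RMSE as the root of the mean squared residual, and RPD as the sample standard deviation of the measured values divided by the RMSE. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error of the predictions."""
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(ss_res / len(y_true))


def rpd(y_true, y_pred):
    """Assumed definition: sample SD (n - 1) of measured values / RMSE."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    sd = math.sqrt(sum((t - mean_y) ** 2 for t in y_true) / (n - 1))
    return sd / rmse(y_true, y_pred)


if __name__ == "__main__":
    # Hypothetical measured and predicted LCC values, for illustration only.
    measured = [38.2, 45.1, 51.7, 42.3, 56.0]
    predicted = [39.0, 44.0, 50.5, 43.8, 54.9]
    print("R2  :", round(r2_score(measured, predicted), 3))
    print("RMSE:", round(rmse(measured, predicted), 3))
    print("RPD :", round(rpd(measured, predicted), 3))
```

Under these definitions R² and RPD are closely linked, since both compare the residual error with the spread of the measured values. Some authors compute the standard deviation with n rather than n - 1, which shifts RPD slightly.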
