Hyperspectral Leaf Reflectance as Proxy for Photosynthetic Capacities: An Ensemble Approach Based on Multiple Machine Learning Algorithms

Global agricultural production is challenged by increasing demand from a rising population and a changing climate, a challenge that may be alleviated through the development of genetically improved crop cultivars. Research into increasing photosynthetic energy conversion efficiency has proposed many strategies to improve production but has yet to yield real-world solutions, largely because of a phenotyping bottleneck. To alleviate this bottleneck, partial least squares regression (PLSR) is increasingly used to relate hyperspectral reflectance to key photosynthetic capacities associated with carbon uptake (maximum carboxylation rate of Rubisco, Vc,max) and conversion of light energy (maximum electron transport rate supporting RuBP regeneration, Jmax). However, its performance varies significantly across plant species, regions, and growth environments. To cope with this heterogeneous performance, this study develops a new approach to estimating photosynthetic capacities. A framework was developed that combines six machine learning algorithms, namely artificial neural network (ANN), support vector machine (SVM), least absolute shrinkage and selection operator (LASSO), random forest (RF), Gaussian process (GP), and PLSR, to optimize high-throughput analysis of the two photosynthetic variables. Six tobacco genotypes, including both transgenic and wild-type lines, with a range of photosynthetic capacities were used to test the framework. Leaf reflectance spectra were measured from 400 to 2500 nm using a high-spectral-resolution spectroradiometer. Corresponding response curves of photosynthesis versus intercellular CO₂ concentration were measured for each leaf using a leaf gas-exchange system. Results suggested that the mean R² of the six regression techniques ranged from 0.60 to 0.65 for Vc,max and from 0.45 to 0.56 for Jmax, with the mean RMSE ranging from 47.1 to 54.0 μmol m⁻² s⁻¹ for Vc,max and from 40.1 to 44.7 μmol m⁻² s⁻¹ for Jmax. Regression stacking performed better than the individual regression techniques for Vc,max (Jmax), increasing R² by 0.1 (0.08) and decreasing RMSE by 4.1 (6.6) μmol m⁻² s⁻¹, equal to an 8% (15%) reduction in RMSE. The better predictive performance of regression stacking is likely attributable to the varying coefficients (or weights) in the level-2 model (the LASSO model) and to the differing ability of each individual regression technique to exploit spectral information. Further refinements can be made to apply this stacked regression technique to other plant phenotypic traits.
