Genetic Algorithm Interval Partial Least Squares Regression Combined Successive Projections Algorithm for Variable Selection in Near-Infrared Quantitative Analysis of Pigment in Cucumber Leaves

Variable (or wavelength) selection plays an important role in the quantitative analysis of near-infrared (NIR) spectra. A method that combines genetic algorithm interval partial least squares regression (GAiPLS) with the successive projections algorithm (SPA) is proposed for variable selection in NIR spectroscopy. GAiPLS was used to select informative interval regions of the spectrum, and SPA was then employed to select the most informative variables while minimizing collinearity among the variables in the model. The performance of the proposed method was compared with that of the full-spectrum model, conventional interval partial least squares regression (iPLS), and backward interval partial least squares regression (BiPLS) in modeling the NIR data sets of pigments in cucumber leaf samples. A multiple linear regression (MLR) model was built from the variables selected by SPA: eight for chlorophylls and five for carotenoids. When this model was applied to the validation set, the correlation coefficients between the MLR-predicted and measured values (rp) were 0.917 for chlorophylls and 0.932 for carotenoids. The results show that the proposed method was able to select important wavelengths from the NIR spectra and made the prediction more robust and accurate in quantitative analysis.
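The SPA and MLR steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `spa_select` and `fit_mlr`, the mean-centering of the columns, and the fixed starting column are assumptions made for illustration. In practice the starting variable and the number of variables are usually chosen by comparing validation errors, and the GAiPLS interval selection that precedes SPA is omitted here.

```python
import numpy as np


def spa_select(X, n_select, start=0):
    """Pick n_select column indices of X by successive projections (sketch).

    Each step projects all columns onto the subspace orthogonal to the most
    recently selected column, then picks the column with the largest
    remaining norm, i.e. the one least collinear with those already chosen.
    """
    Xp = np.asarray(X, dtype=float)
    Xp = Xp - Xp.mean(axis=0)  # assumption: work on mean-centered columns
    selected = [start]
    for _ in range(n_select - 1):
        v = Xp[:, selected[-1]]
        # Remove from every column its component along v.
        Xp = Xp - np.outer(v, v @ Xp) / (v @ v)
        norms = np.linalg.norm(Xp, axis=0)
        norms[selected] = -1.0  # never re-select a chosen column
        selected.append(int(np.argmax(norms)))
    return selected


def fit_mlr(X, y, columns):
    """Least-squares MLR on the chosen columns; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)[:, columns]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef
```

Here `X` would hold the absorbances in the intervals retained by the interval-selection step (one row per leaf sample, one column per wavelength) and `y` the measured pigment content. The number of selected variables should stay below the number of calibration samples, since each projection removes one dimension from the data.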
