A combination algorithm for variable selection to determine soluble solid content and firmness of pears

Selecting informative variables (wavelengths) plays an important role in quantitative analysis by visible and near-infrared (Vis/NIR) spectroscopy. In this study, a new combination of Monte Carlo uninformative variable elimination (MC-UVE) and the successive projections algorithm (SPA) was proposed to select the most effective variables. The selected variables were used as inputs to a least squares support vector machine (LS-SVM) to build MC-UVE-SPA-LS-SVM models for determining the soluble solid content (SSC) and firmness of pears. Conventional partial least squares (PLS) models were also developed for comparison. Calibration models built using MC-UVE-SPA-LS-SVM on 14 and 17 effective variables gave the best balance of model accuracy and model complexity for the two internal quality indices, outperforming the full-spectrum PLS, MC-UVE-PLS, MC-UVE-LS-SVM and MC-UVE-SPA-PLS models. For the prediction set, the correlation coefficient (r), root mean square error of prediction (RMSEP) and residual predictive deviation (RPD) were 0.9486, 0.3244 and 3.1598 for SSC, and 0.8955, 1.1077 and 2.2469 for firmness. Overall, the results indicate that Vis/NIR spectroscopy combined with MC-UVE-SPA-LS-SVM could serve as a fast and accurate alternative method for the nondestructive determination of the SSC and firmness of pears. The effective variables identified might also be useful for developing portable instruments and for online monitoring of pear quality.
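As an illustration of the three figures of merit reported above, the following minimal Python sketch computes r, RMSEP and RPD for a prediction set. It is not the authors' code: the function name `prediction_metrics` and the sample values are hypothetical, and RPD is assumed here to be the sample standard deviation of the reference values divided by RMSEP, which is one common convention (some authors use a bias-corrected standard error of prediction instead).

```python
import numpy as np


def prediction_metrics(y_true, y_pred):
    """Return (r, RMSEP, RPD) for reference values and model predictions.

    Assumption: RPD = sample standard deviation of y_true / RMSEP.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    # Pearson correlation coefficient between reference and predicted values
    r = np.corrcoef(y_true, y_pred)[0, 1]

    # Root mean square error of prediction
    rmsep = np.sqrt(np.mean((y_true - y_pred) ** 2))

    # Residual predictive deviation (ddof=1 gives the sample standard deviation)
    rpd = np.std(y_true, ddof=1) / rmsep

    return r, rmsep, rpd


if __name__ == "__main__":
    # Made-up reference and predicted values, for illustration only
    reference = [10.0, 11.0, 12.0, 13.0, 14.0]
    predicted = [10.5, 10.5, 12.0, 13.5, 13.5]
    r, rmsep, rpd = prediction_metrics(reference, predicted)
    print(f"r = {r:.4f}, RMSEP = {rmsep:.4f}, RPD = {rpd:.4f}")
```

Under this convention, a higher RPD means the prediction error is small relative to the natural spread of the reference values, which is why it is reported alongside r and RMSEP when comparing models.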
