Variable selection by modified IPW (iterative predictor weighting)-PLS (partial least squares) in continuous wavelet regression models.

Variable selection is often used to produce more robust and parsimonious regression models. When it is applied directly to raw near-infrared (NIR) spectra, however, appropriate variables are hard to select because background and noise often overshadow or overlap the absorption bands of the analyte. In this work, a new hybrid algorithm based on selecting the most informative variables in the continuous wavelet transform (CWT) domain is described. The strategy combines CWT with a modified iterative predictor weighting-partial least squares (mIPW-PLS) procedure. After the background and noise in the NIR spectra are eliminated by CWT, the mIPW-PLS approach is used to select the most informative CWT coefficients. Finally, a PLS model is built from the selected CWT coefficients for prediction. The results indicate that extracting the most important variables in the CWT domain effectively avoids the interference of background and noise and yields a high-quality regression model with a very small number of variables and fewer PLS components.
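
The three steps described above (CWT filtering, iterative predictor weighting, final PLS model) can be illustrated with a minimal NumPy sketch. It is not the authors' mIPW-PLS: the abstract does not specify the modification, so the loop below follows the general idea of iterative predictor weighting, in which each predictor is repeatedly rescaled by an importance derived from its PLS regression coefficient. The Mexican hat wavelet, the single scale, the number of iterations, and all function names (`ricker`, `cwt_single_scale`, `pls1_coefficients`, `ipw_select`) are assumptions made for illustration.

```python
import numpy as np

def ricker(points, a):
    """Mexican hat wavelet of width `a`, sampled at `points` positions."""
    t = np.arange(points) - (points - 1) / 2.0
    amp = 2.0 / (np.sqrt(3.0 * a) * np.pi ** 0.25)
    return amp * (1.0 - (t / a) ** 2) * np.exp(-0.5 * (t / a) ** 2)

def cwt_single_scale(X, scale):
    """Convolve every spectrum (row of X) with the wavelet at one scale."""
    points = min(int(10 * scale) + 1, X.shape[1])
    w = ricker(points, scale)
    return np.array([np.convolve(row, w, mode="same") for row in X])

def pls1_coefficients(X, y, n_components):
    """NIPALS PLS1; returns regression coefficients for centered X and y."""
    Xk = X - X.mean(axis=0)
    yk = y - y.mean()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:          # nothing left to explain
            break
        w = w / norm
        t = Xk @ w                # scores
        tt = t @ t
        p = Xk.T @ t / tt         # X loadings
        qk = (yk @ t) / tt        # y loading
        Xk = Xk - np.outer(t, p)  # deflate X and y
        yk = yk - qk * t
        W.append(w); P.append(p); q.append(qk)
    if not W:
        return np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)

def ipw_select(X, y, n_components=2, n_iter=10, n_keep=8):
    """Plain IPW-style loop (assumed form, not the paper's modified version)."""
    s = X.std(axis=0)
    s[s == 0] = 1.0
    Xs = (X - X.mean(axis=0)) / s       # autoscaled predictors
    z = np.ones(X.shape[1])             # predictor importances
    for _ in range(n_iter):
        Xw = Xs * z                     # reweight predictors
        b = pls1_coefficients(Xw, y, n_components)
        imp = np.abs(b) * Xw.std(axis=0)
        total = imp.sum()
        if total == 0:
            break
        z = imp / total                 # normalized importance
    selected = np.argsort(z)[::-1][:n_keep]
    return selected, z

if __name__ == "__main__":
    # Synthetic "spectra": one analyte band plus a random sloping background.
    rng = np.random.default_rng(0)
    n, p = 40, 200
    x = np.arange(p)
    y = rng.uniform(0.0, 1.0, n)
    band = np.exp(-0.5 * ((x - 100) / 6.0) ** 2)
    background = rng.normal(size=(n, 1)) * (x / p) + rng.normal(size=(n, 1))
    X = y[:, None] * band + background + 0.01 * rng.normal(size=(n, p))

    C = cwt_single_scale(X, scale=5.0)          # step 1: CWT coefficients
    selected, z = ipw_select(C, y)              # step 2: pick coefficients
    Cs = C[:, selected]
    b = pls1_coefficients(Cs, y, 2)             # step 3: final PLS model
    y_hat = (Cs - Cs.mean(axis=0)) @ b + y.mean()
    print(sorted(selected.tolist()), float(np.sqrt(np.mean((y - y_hat) ** 2))))
```

Because the Mexican hat wavelet integrates to approximately zero and is symmetric, convolving with it suppresses constant and linear background away from the spectrum edges, which is why selection is run on the CWT coefficients rather than on the raw absorbances. In practice the scale, the number of PLS components, and the number of retained coefficients would be chosen by cross-validation.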
