Aspects of the successive projections algorithm for variable selection in multivariate calibration applied to plasma emission spectrometry

Abstract: The successive projections algorithm (SPA) was recently proposed as a variable selection strategy to minimize collinearity problems in multivariate calibration. Although SPA has been successfully applied to UV–VIS spectrophotometric multicomponent analysis, no evidence has been presented of its ability to handle variable sets that mix high and low signal-to-noise ratios. The present work addresses this issue by applying SPA to the simultaneous determination of Mn, Mo, Cr, Ni and Fe using a low-resolution plasma spectrometer/diode array detection system. This problem is of particular interest because strong interanalyte spectral interferences arise and regions of high and low signal intensity alternate in the spectra. Results show that multiple linear regression (MLR) on the wavelengths selected by SPA yields models with better prediction capabilities than principal component regression (PCR) and partial least squares (PLS) models. A standard genetic algorithm (GA) used for comparison yielded results similar to SPA for Mn, Cr and Fe, and better predictions for Mo and Ni. In all cases, however, the GA produced less parsimonious models than SPA. The average root mean square relative error of prediction (RMSREP) over the five analytes was 1.4% for MLR–SPA, 1.0% for MLR–GA, 2.2% for PCR, and 2.1% for PLS. Since the computational time demanded by SPA grows with the square of the number of spectral variables, a pre-selection procedure based on the identification of emission peaks is proposed. This procedure decreased selection time by a factor of 20 without significantly degrading the results.
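The abstract does not spell out the selection step, so the following is only a minimal sketch of the projection chain usually associated with SPA, not the authors' implementation. It assumes a calibration matrix `X` with one column per wavelength, a caller-supplied, non-zero starting column `start`, and a fixed number of variables `n_select`; the function name `spa_chain` is hypothetical. Choosing the starting column and the number of variables (for example by validation error of the resulting MLR model) is outside this sketch.

```python
import numpy as np

def spa_chain(X, start, n_select):
    """Pick n_select column indices of X by successive orthogonal projections.

    Assumption: X[:, start] is non-zero and n_select does not exceed
    the rank of X, so the projection divisor never vanishes.
    """
    P = np.array(X, dtype=float)  # working copy whose columns get projected
    selected = [start]
    for _ in range(n_select - 1):
        v = P[:, selected[-1]]
        # Remove from every column its component along the last selected column
        P = P - np.outer(v, v @ P) / (v @ v)
        norms = np.linalg.norm(P, axis=0)
        norms[selected] = -1.0  # already-selected columns cannot be picked again
        # The column with the largest residual norm is the least collinear one
        selected.append(int(np.argmax(norms)))
    return selected

# Toy matrix: column 1 is nearly collinear with column 0, column 2 is orthogonal
X = np.array([[1.0, 0.99, 0.0],
              [0.0, 0.10, 0.0],
              [0.0, 0.00, 0.5]])
chain = spa_chain(X, start=0, n_select=2)
```

In this toy case the chain skips the nearly collinear column 1 in favour of column 2, which is the behaviour that lets an MLR model be built on a small, well-conditioned set of wavelengths.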
