An adaptive strategy for selecting representative calibration samples in the continuous wavelet domain for near-infrared spectral analysis

Sample selection is often used to improve the cost-effectiveness of near-infrared (NIR) spectral analysis. With raw NIR spectra, however, selecting appropriate samples is difficult because of background interference and noise. This paper describes a novel adaptive strategy that selects representative NIR spectra in the continuous wavelet transform (CWT) domain. After pretreatment with the CWT, an extension of the Kennard–Stone algorithm (EKS) is used to adaptively select the most representative spectra; only the corresponding samples are then submitted to expensive chemical measurement and used for multivariate calibration. A partial least squares (PLS) model is finally built from the selected samples for prediction. Selecting representative samples in the CWT domain, rather than from the raw spectra, not only effectively eliminates background interference and noise but also further reduces the number of samples required for a good calibration, yielding a high-quality regression model similar to the one obtained with all the samples. The results indicate that the proposed method can effectively enhance the cost-effectiveness of NIR spectral analysis. The strategy can also be applied to other analytical data for multivariate calibration.
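The selection step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the EKS extension, the wavelet, or the scale, so the code below substitutes the classic Kennard–Stone max-min rule with a fixed number of samples, and assumes a Mexican hat wavelet at a single, arbitrarily chosen scale. All function names are hypothetical, and the final PLS calibration step is omitted.

```python
import math


def mexican_hat(scale):
    """Sampled Mexican hat wavelet (assumed choice; the paper's wavelet may differ)."""
    half = max(1, int(5 * scale))
    kernel = []
    for k in range(-half, half + 1):
        t = k / scale
        kernel.append((1.0 - t * t) * math.exp(-0.5 * t * t) / math.sqrt(scale))
    # Force the sampled kernel to sum to zero so constant offsets are removed.
    mean = sum(kernel) / len(kernel)
    return [w - mean for w in kernel]


def cwt_single_scale(spectrum, scale):
    """CWT coefficients at one scale, computed by direct convolution."""
    kernel = mexican_hat(scale)
    half = len(kernel) // 2
    n = len(spectrum)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - half, 0), n - 1)  # repeat edge values as padding
            acc += w * spectrum[idx]
        out.append(acc)
    return out


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kennard_stone(rows, n_select):
    """Classic Kennard-Stone selection (not the EKS variant used in the paper)."""
    n = len(rows)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of rows")
    d = [[sq_dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]
    # Start with the two samples that are farthest apart.
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    first, second = max(pairs, key=lambda p: d[p[0]][p[1]])
    selected = [first, second]
    # min_d[i] is the distance from sample i to its nearest selected sample.
    min_d = [min(d[i][first], d[i][second]) for i in range(n)]
    while len(selected) < n_select:
        remaining = [i for i in range(n) if i not in selected]
        nxt = max(remaining, key=lambda i: min_d[i])  # max-min criterion
        selected.append(nxt)
        min_d = [min(min_d[i], d[i][nxt]) for i in range(n)]
    return selected


def select_in_cwt_domain(spectra, scale, n_select):
    """Transform every spectrum, then select representatives in the CWT domain."""
    transformed = [cwt_single_scale(s, scale) for s in spectra]
    return kennard_stone(transformed, n_select)
```

Because the kernel sums to zero, two spectra that differ only by a constant baseline offset map to nearly identical coefficient vectors, so the max-min rule does not spend a selection on both. The returned indices identify the samples that would go on to reference measurement and PLS modeling.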
