A variable differential consensus method for improving the quantitative near-infrared spectroscopic analysis

Consensus methods are promising tools for improving the reliability of quantitative models in near-infrared (NIR) spectroscopic analysis. A strategy for improving the performance of consensus methods in multivariate calibration of NIR spectra is proposed. In this approach, the spectra are first reduced by uninformative variable elimination (UVE); then, for each variable remaining in the reduced spectra, a subset of non-collinear variables is generated using the successive projections algorithm (SPA). Sub-models are built from these variable subsets and from a calibration subset determined by Monte Carlo (MC) re-sampling, and the sub-model that produces the minimal cross-validation error is selected as a member model. By repeating the MC re-sampling, a series of member models is built, and a consensus model is obtained by averaging all the member models. Because each member model is built with the best variable subset and a randomly selected calibration subset, both the quality and the diversity of the member models are ensured for the consensus model. Two NIR spectral datasets of tobacco lamina are used to investigate the proposed method, and the results demonstrate its superiority in both accuracy and reliability.
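The workflow described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the UVE step is omitted (the input matrix is taken to be the already reduced spectra), sub-models are ordinary multiple linear regression (the abstract does not name the regression method), the SPA chains are computed once on the full calibration set, the cross validation is 5-fold, and each chain has a fixed length. All function names and parameters (`spa_chain`, `build_consensus`, `n_select`, `frac`) are hypothetical.

```python
import numpy as np

def spa_chain(X, start, n_select):
    """SPA-style chain: begin at column `start`, then repeatedly add the
    column with the largest norm after projecting out the last pick."""
    Xp = (X - X.mean(axis=0)).astype(float)
    selected = [start]
    for _ in range(n_select - 1):
        v = Xp[:, selected[-1]].copy()
        # Project every column onto the orthogonal complement of v.
        Xp = Xp - np.outer(v, v @ Xp) / (v @ v)
        norms = np.linalg.norm(Xp, axis=0)
        norms[selected] = -1.0          # never reselect a chosen column
        selected.append(int(np.argmax(norms)))
    return selected

def fit_mlr(X, y):
    """Least-squares fit with intercept (assumed sub-model type)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    return coef[0] + X @ coef[1:]

def cv_rmse(X, y, n_folds=5):
    """Root mean squared error of k-fold cross validation."""
    all_idx = np.arange(len(y))
    sq_err = 0.0
    for test in np.array_split(all_idx, n_folds):
        train = np.setdiff1d(all_idx, test)
        coef = fit_mlr(X[train], y[train])
        sq_err += np.sum((predict_mlr(coef, X[test]) - y[test]) ** 2)
    return np.sqrt(sq_err / len(y))

def build_consensus(X, y, n_members=20, frac=0.7, n_select=5, seed=0):
    """One member per Monte Carlo calibration subset: the variable
    subset with the lowest cross-validation error on that subset."""
    rng = np.random.default_rng(seed)
    # One candidate variable subset per starting variable.
    chains = [spa_chain(X, j, n_select) for j in range(X.shape[1])]
    members = []
    for _ in range(n_members):
        idx = rng.choice(len(y), size=int(frac * len(y)), replace=False)
        Xs, ys = X[idx], y[idx]
        errors = [cv_rmse(Xs[:, cols], ys) for cols in chains]
        best = chains[int(np.argmin(errors))]
        members.append((best, fit_mlr(Xs[:, best], ys)))
    return members

def predict_consensus(members, X):
    """Consensus prediction: plain average over all member models."""
    preds = [predict_mlr(coef, X[:, cols]) for cols, coef in members]
    return np.mean(preds, axis=0)

# Synthetic stand-in for UVE-reduced spectra (not real NIR data).
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 15))
X[:, [2, 7, 11]] *= 3.0                 # three dominant, informative variables
y = 2.0 * X[:, 2] - X[:, 7] + 0.5 * X[:, 11] + rng.normal(scale=0.01, size=60)

members = build_consensus(X, y)
y_hat = predict_consensus(members, X)
rmse = float(np.sqrt(np.mean((y_hat - y) ** 2)))
print(f"members: {len(members)}, calibration RMSE: {rmse:.4f}")
```

In this sketch, diversity among members comes only from the random calibration subsets, while quality comes from picking the variable subset with the lowest cross-validation error on each subset. This mirrors the rationale given above, under the stated assumptions.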
