Variable selection by double competitive adaptive reweighted sampling for calibration transfer of near infrared spectra

Abstract: A new variable selection method, double competitive adaptive reweighted sampling (Double CARS), is proposed for calibration transfer of near infrared spectra. In Double CARS, many variable subsets of the primary spectra are first generated from the calibration model of the primary spectra. Then, for each variable subset of the primary spectra, many variable subsets of the secondary spectra are generated from the transfer set of the secondary spectra. The root mean square error of validation (RMSEV) is computed for each pair of primary and secondary variable subsets, and the pair with the minimal RMSEV is selected as the best variable subsets. Double CARS was tested on three datasets: corn, tablet, and wheat spectra. The results showed that, unlike CARS, which selects variables using the primary spectra alone, Double CARS uses information from both the primary and the secondary spectra and selects variables that give lower errors. Moreover, Double CARS can perform variable selection for paired spectra that have different numbers of variables.
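The nested search described in the abstract can be sketched as follows. This is a minimal illustration of the pairing and RMSEV-minimization step only, not the authors' implementation: the names `rmsev`, `select_best_pair`, `secondary_subsets_for`, and `pair_error` are hypothetical. How the candidate subsets are sampled and how a pair of subsets is turned into validation predictions are left to caller-supplied functions, because the abstract does not specify them.

```python
from math import sqrt
from typing import Callable, Iterable, Sequence, Tuple

# A subset is a tuple of variable (wavelength channel) indices.
Subset = Tuple[int, ...]


def rmsev(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between reference and predicted validation values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return sqrt(squared / len(y_true))


def select_best_pair(
    primary_subsets: Iterable[Subset],
    secondary_subsets_for: Callable[[Subset], Iterable[Subset]],
    pair_error: Callable[[Subset, Subset], float],
) -> Tuple[float, Subset, Subset]:
    """Return (error, primary subset, secondary subset) with the minimal error.

    primary_subsets       -- candidates derived from the primary calibration model
    secondary_subsets_for -- hypothetical hook: candidates for the secondary
                             spectra, generated within one primary subset
    pair_error            -- hypothetical hook: RMSEV of one (primary, secondary) pair
    """
    best = None
    for p in primary_subsets:                 # outer loop: primary subsets
        for s in secondary_subsets_for(p):    # inner loop: secondary subsets
            err = pair_error(p, s)
            if best is None or err < best[0]:
                best = (err, p, s)
    if best is None:
        raise ValueError("no candidate pairs were supplied")
    return best
```

Because each subset is just a tuple of indices, the primary and secondary subsets in a pair need not have the same length, which matches the abstract's remark that the paired spectra may have different numbers of variables.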
