Spectral quantitative analysis of complex samples based on the extreme learning machine

Multivariate calibration methods, both linear and non-linear, have been widely used in the spectral quantitative analysis of complex samples. Linear methods are efficient and involve few parameters, but they perform poorly on nonlinear problems. Non-linear methods handle nonlinearity well, but they require many parameters, are time-consuming, and are easily trapped in local optima. To combine the advantages of both, a novel algorithm called the extreme learning machine (ELM) is introduced. The efficiency and stability of the method are investigated first. Then, the optimal activation function and number of hidden-layer nodes are determined by a newly defined parameter that takes into account both the predictive accuracy and the stability of the model. The predictive performance of ELM is compared with that of principal component regression (PCR), partial least squares (PLS), support vector regression (SVR) and back-propagation artificial neural network (BP-ANN) on three near-infrared (NIR) spectral datasets: diesel fuel, a ternary mixture and blood. Results show that, for a given dataset, the efficiency of ELM is mainly affected by the number of hidden-layer nodes. Despite some instability, ELM becomes stable close to the optimal parameters. Moreover, ELM performs better than or comparably to its competitors in the spectral quantitative analysis of complex samples.
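For orientation, the basic ELM idea can be sketched as follows: the input weights and biases of a single hidden layer are drawn at random and left untrained, and only the output weights are solved by linear least squares. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (`elm_fit`, `elm_predict`, `rmse`), the sigmoid activation, the uniform weight range, the node count of 30 and the synthetic data are all hypothetical choices. The abstract does not define its combined accuracy/stability parameter, so the mean and standard deviation of the prediction error over repeated runs are reported here only as simple stand-ins.

```python
import numpy as np

def elm_fit(X, y, n_hidden, rng):
    """Fit a single-hidden-layer ELM and return (W, b, beta)."""
    # Input weights and biases are random and never trained (assumed range [-1, 1]).
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))  # sigmoid hidden-layer output
    # Output weights: least-squares solution via the Moore-Penrose pseudoinverse.
    beta = np.linalg.pinv(H) @ y
    return W, b, beta

def elm_predict(X, model):
    """Predict with a fitted ELM model tuple (W, b, beta)."""
    W, b, beta = model
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic stand-in for spectra and concentrations (NOT the paper's datasets).
data_rng = np.random.default_rng(0)
X = data_rng.normal(size=(120, 20))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.05 * data_rng.normal(size=120)
X_train, y_train = X[:80], y[:80]
X_test, y_test = X[80:], y[80:]

# Refit with different random hidden weights to expose run-to-run variation.
errors = []
for seed in range(20):
    model = elm_fit(X_train, y_train, n_hidden=30, rng=np.random.default_rng(seed))
    errors.append(rmse(y_test, elm_predict(X_test, model)))
errors = np.array(errors)

mean_rmsep = errors.mean()  # stand-in for predictive accuracy
std_rmsep = errors.std()    # stand-in for stability across random initialisations
print(f"RMSEP over 20 runs: mean={mean_rmsep:.3f}, std={std_rmsep:.3f}")
```

Repeating the loop over several values of `n_hidden` and over different activation functions would show how both quantities change, which is the kind of comparison the abstract describes when selecting the optimal settings.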
