A boosting extreme learning machine for near-infrared spectral quantitative analysis of diesel fuel and edible blend oil samples

Extreme learning machines (ELMs) have drawn increasing attention because of their simple structure, high learning speed and excellent performance. However, a single ELM tends to give low predictive accuracy and unstable results in the quantitative analysis of complex samples. To further improve the predictive accuracy and stability of ELMs, a new quantitative model, called the boosting ELM, is proposed. In this approach, a large number of ELM sub-models are built sequentially, each from a certain number of samples selected from the original training set according to the distribution of the sampling weights, and their predictions are then aggregated using the weighted median. The activation function and the number of hidden nodes of the ELM sub-models are determined simultaneously by the ratio of the mean value to the standard deviation of the correlation coefficients (MSR). The performance of the proposed method is tested with diesel fuel and blended edible oil samples. Compared with partial least squares (PLS) and ELM models, the boosting ELM proves to be an efficient ensemble model with clear advantages in predictive accuracy and stability. Therefore, the proposed method may be an alternative for near-infrared (NIR) spectral quantitative analysis of complex samples.
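The procedure described in the abstract (weighted sampling, sequential ELM sub-models, weighted-median aggregation) can be illustrated with a minimal sketch. The abstract does not give the weight-update rule, so the sketch assumes the common AdaBoost.R2-style update with a linear loss. It also fixes a tanh activation and the number of hidden nodes instead of selecting them by MSR. All function names and parameters (`fit_elm`, `fit_boosting_elm`, `subset_frac`, and so on) are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_elm(X, y, n_hidden, rng):
    """Single ELM: random hidden layer, least-squares output weights."""
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights (never trained)
    b = rng.normal(size=n_hidden)                # random hidden biases
    H = np.tanh(X @ W + b)                       # assumed activation: tanh
    beta = np.linalg.pinv(H) @ y                 # Moore-Penrose solution for output weights
    return W, b, beta

def predict_elm(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half of the total weight."""
    order = np.argsort(values)
    cum = np.cumsum(weights[order])
    return values[order][np.searchsorted(cum, 0.5 * cum[-1])]

def fit_boosting_elm(X, y, n_models=50, n_hidden=20, subset_frac=0.5, seed=0):
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    p = np.full(n, 1.0 / n)                      # sampling weights, initially uniform
    size = max(1, int(subset_frac * n))
    models, conf = [], []
    for _ in range(n_models):
        # Draw a training subset according to the current sampling weights.
        idx = rng.choice(n, size=size, replace=True, p=p)
        model = fit_elm(X[idx], y[idx], n_hidden, rng)
        err = np.abs(y - predict_elm(model, X))
        if err.max() == 0.0:                     # perfect fit: keep the model and stop
            models.append(model)
            conf.append(1.0)
            break
        loss = err / err.max()                   # assumed linear loss in [0, 1]
        avg_loss = max(float(np.sum(p * loss)), 1e-12)
        if avg_loss >= 0.5:                      # sub-model too weak (AdaBoost.R2 stop rule)
            if not models:                       # always keep at least one sub-model
                models.append(model)
                conf.append(1.0)
            break
        b_t = avg_loss / (1.0 - avg_loss)
        models.append(model)
        conf.append(np.log(1.0 / b_t))           # more accurate sub-model, larger weight
        p = p * b_t ** (1.0 - loss)              # shrink weights of well-predicted samples
        p /= p.sum()
    return models, np.array(conf)

def predict_boosting_elm(models, conf, X):
    preds = np.column_stack([predict_elm(m, X) for m in models])
    return np.array([weighted_median(row, conf) for row in preds])
```

With a spectral matrix `X` (samples by wavelengths) and reference values `y`, calling `fit_boosting_elm(X, y)` followed by `predict_boosting_elm(models, conf, X_new)` gives ensemble predictions. The weighted median is less sensitive than a weighted mean to a few poor sub-models, which is one plausible reason for choosing it as the aggregation rule.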
