Subagging for the improvement of predictive stability of extreme learning machine for spectral quantitative analysis of complex samples

Abstract Extreme learning machine (ELM) has attracted increasing attention for its fast learning speed and excellent generalization performance. However, the prediction of a single ELM regression model is usually unstable because the input weights and hidden-layer biases are generated randomly. To overcome this drawback, an ensemble form of ELM, termed subagging ELM, was proposed and used for spectral quantitative analysis of complex samples. In this approach, a series of ELM sub-models is built, each on a fixed number of samples drawn randomly without replacement from the original training set, and the predictions of these sub-models are then combined by simple averaging to give the final ensemble prediction. The performance of the method was tested with fuel oil and blood samples. The results confirm that subagging ELM achieves much better stability and higher accuracy than a single ELM model.
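The procedure described in the abstract can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the sigmoid activation, the uniform [-1, 1] range for the random weights, the pseudoinverse solution for the output weights, the function names, and the default values of `n_models`, `subsample_ratio`, and `n_hidden` are all assumptions, since the abstract does not specify them. The data in the usage lines are synthetic and stand in for spectra and reference values.

```python
import numpy as np


def _hidden(X, W, b):
    # Sigmoid hidden-layer output (the activation choice is an assumption).
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def fit_elm(X, y, n_hidden, rng):
    """Fit one ELM: random input weights and biases, least-squares output weights."""
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # random input weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # random hidden biases
    beta = np.linalg.pinv(_hidden(X, W, b)) @ y              # output weights
    return W, b, beta


def predict_elm(model, X):
    W, b, beta = model
    return _hidden(X, W, b) @ beta


def draw_subsample(n_samples, n_draw, rng):
    """Pick n_draw distinct training indices (sampling WITHOUT replacement)."""
    return rng.choice(n_samples, size=n_draw, replace=False)


def fit_subagging_elm(X, y, n_models=50, subsample_ratio=0.5, n_hidden=20, seed=0):
    """Build a list of ELM sub-models, each trained on a random subsample."""
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    n_draw = max(1, int(round(subsample_ratio * n_samples)))
    models = []
    for _ in range(n_models):
        idx = draw_subsample(n_samples, n_draw, rng)
        models.append(fit_elm(X[idx], y[idx], n_hidden, rng))
    return models


def predict_subagging_elm(models, X):
    """Ensemble prediction: simple average of the sub-model predictions."""
    return np.mean([predict_elm(m, X) for m in models], axis=0)


# Usage on synthetic data (placeholders for spectra and reference values).
demo_rng = np.random.default_rng(42)
X_train = demo_rng.normal(size=(60, 10))
y_train = X_train @ demo_rng.normal(size=10)
X_test = demo_rng.normal(size=(15, 10))

ensemble = fit_subagging_elm(X_train, y_train, n_models=30, n_hidden=15)
y_pred = predict_subagging_elm(ensemble, X_test)
print(y_pred.shape)
```

Because each sub-model draws its own random weights and its own subsample, averaging their outputs is what damps the run-to-run variation of a single ELM. The subsample ratio and the number of sub-models are tuning parameters in this sketch and would need to be chosen per data set.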
