Comparison of Ensemble Strategies in Online NIR for Monitoring the Extraction Process of Pericarpium Citri Reticulatae Based on Different Variable Selections

Different ensemble strategies were compared for online near-infrared (NIR) models that monitor active pharmaceutical ingredients in Traditional Chinese Medicine. Bagging partial least squares regression (bagging-PLS) and boosting partial least squares regression (boosting-PLS) were applied to NIR models to determine the hesperidin and nobiletin content during the extraction of Pericarpium Citri Reticulatae in a pilot-scale system. Several pretreatment methods were investigated, including Savitzky-Golay smoothing, derivatives, multiplicative scatter correction, standard normal variate, normalization, and combinations of these. Two variable selection methods, synergy interval partial least squares (siPLS) and backward interval partial least squares (biPLS), were evaluated. Using the variables selected by siPLS, bagging-PLS and boosting-PLS were then applied to the quantitative analysis. The results demonstrated that the approach can be used for rapid determination and real-time monitoring of hesperidin and nobiletin in Pericarpium Citri Reticulatae (Citrus reticulata) during extraction. Of the two ensemble methods, boosting-PLS was slightly more accurate than bagging-PLS. The paper thus offers a promising ensemble strategy for online NIR models in Chinese medicine.
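The two ensemble strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the PLS1 routine is a plain NIPALS fit written with NumPy, the boosting variant is a simplified residual-fitting scheme with a fixed shrinkage factor (an assumption, since the abstract does not give the weighting scheme), and the "spectra" are synthetic data generated only for illustration. All function names and parameter values are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model by NIPALS; returns (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w = w / np.linalg.norm(w)          # weight vector
        t = Xk @ w                         # scores
        tt = t @ t
        p = Xk.T @ t / tt                  # X loadings
        c = yk @ t / tt                    # y loading
        Xk = Xk - np.outer(t, p)           # deflate X
        yk = yk - c * t                    # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)  # regression vector in X space
    return coef, x_mean, y_mean

def predict(model, X):
    coef, x_mean, y_mean = model
    return (X - x_mean) @ coef + y_mean

def bagging_pls_fit(X, y, n_components=3, n_models=30, seed=0):
    """Bagging: fit one PLS model per bootstrap resample of the calibration set."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    fitted = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)   # sample rows with replacement
        fitted.append(pls1_fit(X[idx], y[idx], n_components))
    return fitted

def bagging_pls_predict(fitted, X):
    """Average the predictions of all bootstrap models."""
    return np.mean([predict(m, X) for m in fitted], axis=0)

def boosting_pls_fit(X, y, n_rounds=50, shrinkage=0.5):
    """Simplified boosting: repeatedly fit a 1-component PLS to the residual."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, residual = X - x_mean, y - y_mean
    coef = np.zeros(X.shape[1])
    for _ in range(n_rounds):
        step, _, _ = pls1_fit(Xc, residual, 1)
        coef += shrinkage * step           # add a damped weak learner
        residual = residual - shrinkage * (Xc @ step)
    return coef, x_mean, y_mean

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic data (assumption): one analyte band, one interfering band, noise.
rng = np.random.default_rng(1)
axis = np.linspace(0.0, 1.0, 50)
analyte_band = np.exp(-((axis - 0.5) ** 2) / 0.01)
interferent_band = np.exp(-((axis - 0.2) ** 2) / 0.02)
y_all = rng.uniform(0.1, 1.0, size=80)
X_all = (np.outer(y_all, analyte_band)
         + 0.3 * np.outer(rng.uniform(0.0, 1.0, size=80), interferent_band)
         + rng.normal(0.0, 0.01, size=(80, 50)))

# Calibration / validation split.
X_cal, y_cal, X_val, y_val = X_all[:60], y_all[:60], X_all[60:], y_all[60:]

models = bagging_pls_fit(X_cal, y_cal)
rmse_bag = rmse(y_val, bagging_pls_predict(models, X_val))
rmse_boost = rmse(y_val, predict(boosting_pls_fit(X_cal, y_cal), X_val))
print("bagging-PLS RMSEP:", rmse_bag)
print("boosting-PLS RMSEP:", rmse_boost)
```

In a real calibration the pretreatment and interval selection steps would be applied to the spectra before either ensemble is fitted, and the number of latent variables, ensemble size, and shrinkage would be chosen by cross-validation rather than fixed as they are here.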
