Determination of dry matter content of tea by near and middle infrared spectroscopy coupled with wavelet-based data mining algorithms

To explore the potential of near and middle infrared spectroscopy for fast determination of the dry matter content (DMC) of tea throughout processing, from fresh tea leaf through semi-manufactured tea to finished tea, samples were collected from seven stages of tea processing and analyzed with data mining algorithms. The Kubelka-Munk transform and spectral pre-treatment were adopted to eliminate disturbances caused by the irregular appearance of intact tea in diffuse reflectance mode. A wavelet-based data mining algorithm composed of wavelet packet transform and statistical analysis (WPT-SA) was proposed to extract and optimize spectral features from the full-spectrum data. A second data mining algorithm, kernel principal component analysis (KPCA), was employed for performance comparison. Regression models were established on the full-spectrum data, the wavelet spectral features and the kernel principal components, respectively. Statistical analysis revealed that the wavelet parameters (basis function and scale) significantly affected the R^2 and root mean square error (RMSE) of the determination model, so optimization of the wavelet parameters was vital for the application of WPT. Modeling results showed that the regression model based on the wavelet spectral features outperformed the other models; the optimal regression model obtained a high R^2 of 0.9556 and a low RMSE of 0.0501. These results indicate that it is feasible to measure the DMC of tea at different processing stages using near and middle infrared spectroscopy, and that the proposed feature optimization algorithm (WPT-SA) is an effective data mining approach for enhancing the capability of spectral measurement.
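As a rough illustration of two quantities named in the abstract, the sketch below applies the standard Kubelka-Munk function, F(R) = (1 - R)^2 / (2R), to diffuse reflectance values and computes RMSE and R^2 for a set of predictions. The function names are hypothetical, the R^2 shown is the common coefficient-of-determination form (the paper may define it differently), and none of this is the authors' implementation.

```python
import math


def kubelka_munk(reflectance):
    """Convert diffuse reflectance values (0 < R <= 1) to Kubelka-Munk units."""
    converted = []
    for r in reflectance:
        if not 0.0 < r <= 1.0:
            raise ValueError("reflectance must lie in (0, 1]")
        # Standard Kubelka-Munk function: F(R) = (1 - R)^2 / (2R)
        converted.append((1.0 - r) ** 2 / (2.0 * r))
    return converted


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination (assumed definition: 1 - SS_res / SS_tot)."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

In a calibration workflow, `kubelka_munk` would be applied to each reflectance spectrum before feature extraction, and `rmse` and `r_squared` would be evaluated on reference versus predicted DMC values.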
