Discrimination of traditional herbal medicines based on terahertz spectroscopy

Abstract: Terahertz (THz) spectroscopy was used to develop an efficient and practical method for discriminating traditional herbal medicines. Spectra of three herbal medicines (Herba Solani Lyrati, Herba Solani Nigri, and Herba Aristolochiae Mollissimae) were acquired over the range 0.2–1.2 THz. Principal component analysis (PCA) was applied to reduce the dimensionality of the original spectral data. Three classification algorithms, support vector machine (SVM), decision tree (DT), and random forest (RF), were then used to discriminate the herbal medicines. Receiver operating characteristic (ROC) curves and the area under the ROC curve (AUC) were combined with classification accuracy to evaluate the performance of the three algorithms. The PCA-RF method yielded the best ROC curve and AUC and achieved a prediction accuracy of 99%. The experimental results indicate that THz spectroscopy combined with chemometric algorithms is an effective and rapid method for the discrimination of traditional herbal medicines.
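The workflow summarized in the abstract (PCA for dimensionality reduction, then SVM, DT, and RF classifiers compared by accuracy and AUC) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, and the spectra are synthetic stand-ins generated by a hypothetical helper, `make_synthetic_spectra`. The number of principal components, the train/test split, the classifier settings, and the one-vs-rest multiclass AUC are all illustrative assumptions, since the abstract does not specify them.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def make_synthetic_spectra(n_per_class=60, n_points=100, seed=0):
    """Hypothetical stand-in for measured spectra over 0.2-1.2 THz.

    Each of the three classes gets a Gaussian bump at a different
    (made-up) frequency on top of a rising baseline, plus noise.
    """
    rng = np.random.default_rng(seed)
    freqs = np.linspace(0.2, 1.2, n_points)
    bump_centers = [0.5, 0.7, 0.9]  # assumed values, not from the paper
    X, y = [], []
    for label, center in enumerate(bump_centers):
        base = freqs + 0.5 * np.exp(-(((freqs - center) / 0.1) ** 2))
        noise = 0.05 * rng.standard_normal((n_per_class, n_points))
        X.append(base + noise)
        y.append(np.full(n_per_class, label))
    return np.vstack(X), np.concatenate(y)


def evaluate(n_components=5, seed=0):
    """Fit PCA + classifier pipelines and report accuracy and AUC."""
    X, y = make_synthetic_spectra(seed=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    classifiers = {
        "PCA-SVM": SVC(probability=True, random_state=seed),
        "PCA-DT": DecisionTreeClassifier(random_state=seed),
        "PCA-RF": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    results = {}
    for name, clf in classifiers.items():
        # Scaling and PCA are fitted on the training split only.
        pipe = make_pipeline(StandardScaler(), PCA(n_components=n_components), clf)
        pipe.fit(X_train, y_train)
        proba = pipe.predict_proba(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, pipe.predict(X_test)),
            # One-vs-rest AUC averaged over the three classes (an assumption).
            "auc": roc_auc_score(y_test, proba, multi_class="ovr"),
        }
    return results


if __name__ == "__main__":
    for name, scores in evaluate().items():
        print(f"{name}: accuracy={scores['accuracy']:.3f}, AUC={scores['auc']:.3f}")
```

Placing PCA inside the pipeline keeps the test spectra out of the fitted projection, so the reported accuracy and AUC reflect held-out data. The scores on this synthetic data say nothing about the results reported in the paper.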
