Chemometrics‐Based Approach to Feature Selection of Chromatographic Profiles and its Application to Search Active Fraction of Herbal Medicine

In our previous report (J Pharmaceut Biomed 56 (2011) 443–447), a support vector machine (SVM)‐based pharmacodynamic model was established for predicting the active fractions of herbal medicines (HMs); the information contained in the chromatograms of the fractions was represented by peak areas. This representation, however, misses the global characteristics of the chromatograms, which runs contrary to the holistic view underlying HM theory and is likely to reduce the model's success rate. To address this problem, two chemometric methods, minimum redundancy maximum relevance (mRMR) and a particle swarm optimizer (PSO), were applied in this article to select features from the whole chromatograms, and the PSO was also used to tune the SVM parameters. Xiangdan injection was investigated as a case study. The predictive accuracy was evaluated and compared with that of other popular and previously reported methods. Furthermore, validation on an independent prediction set showed that the predicted bioactivities were consistent with the experimental values. The model could potentially be extended to help search for the active fractions of other HMs.
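
The mRMR step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the common greedy "difference" form of mRMR (relevance minus mean redundancy), a simple histogram estimate of mutual information, and synthetic data in place of real chromatograms. The function names `mutual_information` and `mrmr_select`, the bin count, and the data dimensions are all hypothetical choices made for illustration. The PSO search and the SVM model are not shown.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Histogram estimate of the mutual information (in nats) of two 1-D arrays."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                # joint probability table
    px = pxy.sum(axis=1, keepdims=True)      # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)      # marginal of y, shape (1, bins)
    nz = pxy > 0                             # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def mrmr_select(X, y, n_features, bins=8):
    """Greedy mRMR, difference form: maximise relevance minus mean redundancy."""
    n_points = X.shape[1]
    # Relevance: mutual information between each candidate feature and the response.
    relevance = np.array(
        [mutual_information(X[:, j], y, bins) for j in range(n_points)]
    )
    selected = [int(np.argmax(relevance))]   # start with the most relevant feature
    redundancy_sum = np.zeros(n_points)
    while len(selected) < n_features:
        last = selected[-1]
        # Accumulate redundancy with the feature that was picked most recently.
        redundancy_sum += np.array(
            [mutual_information(X[:, j], X[:, last], bins) for j in range(n_points)]
        )
        score = relevance - redundancy_sum / len(selected)
        score[selected] = -np.inf            # never pick the same feature twice
        selected.append(int(np.argmax(score)))
    return selected


# Synthetic stand-in for chromatographic data (assumption, not from the paper):
# 500 "fractions" x 30 "retention-time points"; activity depends on points 5 and 12.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 30))
y = X[:, 5] + 0.5 * X[:, 12] + 0.1 * rng.normal(size=500)

selected = mrmr_select(X, y, n_features=3)
print(selected)
```

In a pipeline like the one described, the selected points would then be passed to the SVM, with an optimizer such as PSO searching over the feature subset and the SVM hyperparameters; that part is left out here because the abstract gives no detail on how it was configured.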
