Revealing metabolite biomarkers for acupuncture treatment by linear programming based feature selection

Background: Acupuncture has been practiced in China for thousands of years as part of Traditional Chinese Medicine (TCM) and has gradually been accepted in Western countries as an alternative or complementary treatment. However, the underlying mechanism of acupuncture, and especially whether there is any difference between various acupoints, remains largely unknown, which hinders its widespread use.

Results: In this study, we develop a novel Linear Programming based Feature Selection method (LPFS) to understand the mechanism of the acupuncture effect at the molecular level by revealing metabolite biomarkers for acupuncture treatment. Specifically, we generate and investigate high-throughput metabolic profiles of acupuncture treatment at several acupoints in humans. To select the subsets of metabolites that best characterize the acupuncture effect for each meridian point, we propose an optimization model that identifies biomarkers from high-dimensional metabolic data of case and control samples. Importantly, we use the nearest centroid as the prototype classifier and simultaneously minimize the number of selected features and the leave-one-out cross-validation error of the classifier. We compare the performance of LPFS with several state-of-the-art methods, such as SVM recursive feature elimination (SVM-RFE) and sparse multinomial logistic regression (SMLR). We find that LPFS tends to reveal a small set of metabolites with small standard deviations and large shifts between case and control, which is exactly what we require of a good biomarker. Biologically, several metabolite biomarkers for acupuncture treatment are revealed and serve as candidates for further mechanistic investigation. In addition, the biomarkers derived from five meridian points, Zusanli (ST36), Liangmen (ST21), Juliao (ST3), Yanglingquan (GB34), and Weizhong (BL40), are compared for their similarities and differences, which provides evidence for the specificity of acupoints.

Conclusions: Our results demonstrate that metabolic profiling may be a promising method for investigating the molecular mechanism of acupuncture. Compared with other existing methods, LPFS performs better at selecting a small set of key molecules. In addition, LPFS is a general methodology and can be applied to other high-dimensional data analyses, for example in cancer genomics.
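The evaluation criterion named in the Results, the leave-one-out cross-validation error of a nearest-centroid classifier restricted to a chosen feature subset, can be illustrated with a minimal sketch. This is not the LPFS linear program itself, which the abstract does not spell out. The function names, the squared Euclidean distance, and the toy data below are all assumptions made for illustration only.

```python
from statistics import mean


def centroid(rows, features):
    """Mean of each selected feature over the given samples."""
    return [mean(row[j] for row in rows) for j in features]


def sq_dist(x, center, features):
    """Squared Euclidean distance restricted to the selected features."""
    return sum((x[j] - c) ** 2 for j, c in zip(features, center))


def loocv_error(samples, labels, features):
    """Fraction of samples misclassified by a nearest-centroid
    classifier under leave-one-out cross validation."""
    errors = 0
    for i, (x, y) in enumerate(zip(samples, labels)):
        # Hold out sample i and build class centroids from the rest.
        train = [(s, l) for k, (s, l) in enumerate(zip(samples, labels)) if k != i]
        best_label, best_d = None, float("inf")
        for label in sorted({l for _, l in train}):
            rows = [s for s, l in train if l == label]
            d = sq_dist(x, centroid(rows, features), features)
            if d < best_d:
                best_label, best_d = label, d
        errors += best_label != y
    return errors / len(samples)


# Hypothetical toy data: column 0 shifts clearly between groups with a
# small spread; column 1 is noise unrelated to the group.
samples = [
    [1.0, 5.0], [1.2, 1.0], [0.9, 9.0],  # control
    [3.0, 4.0], [3.1, 8.0], [2.9, 2.0],  # case
]
labels = ["control"] * 3 + ["case"] * 3

err_informative = loocv_error(samples, labels, [0])
err_noise = loocv_error(samples, labels, [1])
print(err_informative, err_noise)
```

On this toy data, the feature with a large shift and small spread separates the groups under leave-one-out validation, while the noisy feature does not. A feature selection method in the spirit described above would prefer the first column; how LPFS encodes that preference as a linear program is not stated in the abstract.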
