Selective paired ion contrast analysis: a novel algorithm for analyzing postprocessed LC-MS metabolomics data possessing high experimental noise.

A recurring problem in analyzing biological data from noisy sources, such as human subjects, is the sheer variability introduced by experimentally irrelevant factors that cannot be controlled for. This holds especially true in metabolomics, the global study of small molecules in a particular system. While metabolomics can offer deep quantitative insight into the metabolome via easy-to-acquire biofluid samples such as urine and blood, these confounding factors can easily overwhelm attempts to extract relevant information, which can mar potentially crucial applications such as biomarker discovery. To address this, a new algorithm called Selective Paired Ion Contrast Analysis (SPICA) has been developed with the intent of extracting potentially biologically relevant information from the noisiest of metabolomic data sets. SPICA is built upon redefining the fundamental unit of statistical analysis: whereas the vast majority of algorithms analyze metabolomics data on a single-ion basis, SPICA analyzes ion-pairs. A standard metabolomic data set is reinterpreted by exhaustively considering all possible ion-pair combinations. Statistical comparisons between sample groups are made only by analyzing the differences in these pairs, which may be crucial in situations where no single metabolite can be used for normalization. With SPICA, human urine data sets from patients undergoing total body irradiation (TBI) and from a colorectal cancer (CRC) relapse study were analyzed in a statistically rigorous manner not possible with conventional methods. In the TBI study, 3530 statistically significant ion-pairs were identified, from which numerous putative radiation-specific metabolite-pair biomarkers that mapped to potentially perturbed metabolic pathways were elucidated. In the CRC study, SPICA identified 6461 statistically significant ion-pairs, several of which putatively mapped to folic acid biosynthesis, a key pathway in colorectal cancer. When combined with support vector machines (SVMs), SPICA also unequivocally outperformed binary classifiers built from classical single-ion features.
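The ion-pair idea can be illustrated with a minimal sketch. This is not the published SPICA implementation: the abstract does not state how a pair contrast is computed or which statistical test is applied, so the log2 intensity ratio and Welch's t statistic used here are assumptions chosen for illustration, and the function names `pair_contrasts`, `welch_t`, and `rank_pairs` are hypothetical. The sketch assumes each sample is a list of strictly positive ion intensities (no missing values) in a shared ion order.

```python
from itertools import combinations
from math import log2, sqrt
from statistics import mean, variance


def pair_contrasts(samples):
    """Map every ion pair (i, j), i < j, to its per-sample log2 ratio.

    Assumption: the contrast is a log ratio, so any per-sample scaling
    factor (e.g. urine dilution) cancels out within a pair.
    """
    n_ions = len(samples[0])
    return {
        (i, j): [log2(s[i] / s[j]) for s in samples]
        for i, j in combinations(range(n_ions), 2)
    }


def welch_t(a, b):
    """Welch's t statistic for two independent groups (illustrative test)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def rank_pairs(group_a, group_b):
    """Rank all ion pairs by |t| of their contrast between two groups."""
    ca, cb = pair_contrasts(group_a), pair_contrasts(group_b)
    scores = {pair: welch_t(ca[pair], cb[pair]) for pair in ca}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Toy data: three ions, four samples per group, with overall intensity
# varying widely between samples. Ion 0 shifts relative to ions 1 and 2
# in the second group, while the ratio of ion 1 to ion 2 stays stable.
control = [
    [100.0, 50.0, 25.0],
    [210.0, 100.0, 52.0],
    [48.0, 26.0, 12.0],
    [400.0, 190.0, 104.0],
]
treated = [
    [100.0, 100.0, 48.0],
    [30.0, 28.0, 15.0],
    [250.0, 260.0, 125.0],
    [80.0, 76.0, 40.0],
]

for pair, t in rank_pairs(control, treated):
    print(pair, round(t, 2))
```

With n ions there are n(n-1)/2 pairs, so a real data set with thousands of ions yields millions of comparisons; any significance threshold applied to such a ranking would need a multiple-testing correction, which this sketch omits.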
