MetabR: an R script for linear model analysis of quantitative metabolomic data

Background
Metabolomics is an emerging high-throughput approach to systems biology, but its data analysis tools lag behind those available for other systems-level disciplines such as transcriptomics and proteomics. Metabolomic data analysis requires a normalization step to remove systematic effects of confounding variables on metabolite measurements. Current tools may not correctly normalize every metabolite when the relationship between metabolite quantity and fixed-effect confounding variables differs among metabolites, and they may not account for the effects of random-effect confounding variables. Linear mixed models, an established methodology in the microarray literature, offer a standardized and flexible approach for removing the effects of fixed- and random-effect confounding variables from metabolomic data.

Findings
Here we present a simple menu-driven program, “MetabR”, designed to aid researchers with no programming background in the statistical analysis of metabolomic data. Written in the open-source statistical programming language R, MetabR uses linear mixed models to normalize metabolomic data and analysis of variance (ANOVA) to test treatment differences. MetabR exports normalized data, checks statistical model assumptions, identifies differentially abundant metabolites, and produces output files to help with data interpretation. Example data are provided to illustrate normalization for common confounding variables and to demonstrate the utility of the MetabR program.

Conclusions
We developed MetabR as a simple and user-friendly tool for implementing linear mixed model-based normalization and statistical analysis of targeted metabolomic data, which helps address the shortage of data analysis tools in this field. The program, user guide, example data, and any future news or updates related to the program may be found at http://metabr.r-forge.r-project.org/.
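
To make the idea of removing a random-effect confounder concrete, the following is a minimal sketch in Python. It is not MetabR's code (MetabR is written in R), and it rests on several assumptions: a single metabolite, one random-effect confounder (a hypothetical "batch"), a balanced design with the same number of samples per batch, and method-of-moments variance estimates. The function name, the data, and the batch labels are all illustrative. A real analysis would also include the treatment as a fixed effect in the model, which this sketch omits.

```python
from statistics import mean


def normalize_random_intercept(values_by_batch):
    """Remove a random batch effect from one metabolite's log-abundances.

    Assumes a balanced one-way random-intercept model:
        y_ij = mu + b_i + e_ij
    Variance components come from the classical ANOVA (method-of-moments)
    estimator; the batch effect is predicted by shrinking each batch's
    deviation from the grand mean (the BLUP for this balanced case).
    """
    batches = list(values_by_batch)
    k = len(batches)
    n = len(values_by_batch[batches[0]])
    if k < 2 or n < 2:
        raise ValueError("need at least 2 batches and 2 samples per batch")
    if any(len(v) != n for v in values_by_batch.values()):
        raise ValueError("this sketch requires a balanced design")

    grand = mean(x for v in values_by_batch.values() for x in v)
    batch_means = {b: mean(v) for b, v in values_by_batch.items()}

    # Between-batch and within-batch mean squares.
    msb = n * sum((m - grand) ** 2 for m in batch_means.values()) / (k - 1)
    msw = sum(
        (x - batch_means[b]) ** 2
        for b, v in values_by_batch.items()
        for x in v
    ) / (k * (n - 1))

    # Variance of the random batch effect, truncated at zero.
    var_batch = max((msb - msw) / n, 0.0)
    denom = var_batch + msw / n
    shrink = var_batch / denom if denom > 0 else 0.0

    # Predicted batch effects, then subtract them from each observation.
    blup = {b: shrink * (batch_means[b] - grand) for b in batches}
    normalized = {
        b: [x - blup[b] for x in v] for b, v in values_by_batch.items()
    }
    return normalized, blup, shrink


# Made-up log-abundances for one metabolite measured in three batches.
data = {
    "batch_A": [10.0, 10.2, 9.8],
    "batch_B": [11.0, 11.2, 10.8],
    "batch_C": [9.0, 9.2, 8.8],
}
normalized, blup, shrink = normalize_random_intercept(data)
```

Because the batch effect is treated as random, each batch's offset is shrunk toward zero rather than removed outright, so a little between-batch spread remains in the normalized values. With unbalanced data or several confounders, a dedicated mixed-model routine would be needed instead of these closed-form estimates.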
