A dynamic probabilistic principal components model for the analysis of longitudinal metabolomics data

In a longitudinal metabolomics study, multiple metabolites are measured from several observations at many time points. Interest lies in reducing the dimensionality of such data and in highlighting influential metabolites which change over time. A dynamic probabilistic principal components analysis model is proposed to achieve dimension reduction while appropriately modelling the correlation due to repeated measurements. This is achieved by assuming an auto-regressive model for some of the model parameters. Linear mixed models are subsequently used to identify influential metabolites which change over time. The model proposed is used to analyse data from a longitudinal metabolomics animal study.
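As a rough illustration of the idea, and not the authors' estimation procedure, the sketch below simulates data from a probabilistic principal components model at several time points, with the log noise variance following a first-order auto-regressive process. It then recovers the loadings and noise variance at each time point using the standard closed-form maximum likelihood solution for probabilistic PCA. Which parameters are auto-regressive, the AR(1) order, the coefficient `phi`, the fixed loadings, and all function names and dimensions are assumptions made for this example; the abstract does not specify them.

```python
import numpy as np


def simulate_dynamic_ppca(n_obs=20, n_vars=6, n_comp=2, n_times=5, phi=0.8, seed=0):
    """Simulate PPCA data over time with an AR(1) log noise variance (assumed form)."""
    rng = np.random.default_rng(seed)
    loadings = rng.normal(size=(n_vars, n_comp))  # held fixed over time for simplicity

    # AR(1) process on the log noise variance: h_t = phi * h_{t-1} + innovation
    log_var = np.empty(n_times)
    log_var[0] = rng.normal(scale=0.1)
    for t in range(1, n_times):
        log_var[t] = phi * log_var[t - 1] + rng.normal(scale=0.1)

    data = np.empty((n_times, n_obs, n_vars))
    for t in range(n_times):
        scores = rng.normal(size=(n_obs, n_comp))  # latent scores
        noise = rng.normal(scale=np.sqrt(np.exp(log_var[t])), size=(n_obs, n_vars))
        data[t] = scores @ loadings.T + noise
    return data, loadings, log_var


def ppca_ml(x, n_comp):
    """Closed-form maximum likelihood PPCA for one time point."""
    xc = x - x.mean(axis=0)
    cov = xc.T @ xc / x.shape[0]
    eigval, eigvec = np.linalg.eigh(cov)
    order = np.argsort(eigval)[::-1]  # sort eigenvalues in descending order
    eigval, eigvec = eigval[order], eigvec[:, order]
    sigma2 = eigval[n_comp:].mean()  # noise variance: mean of discarded eigenvalues
    scale = np.sqrt(np.maximum(eigval[:n_comp] - sigma2, 0.0))
    w = eigvec[:, :n_comp] * scale  # loadings, identified up to rotation
    return w, sigma2


data, true_loadings, true_log_var = simulate_dynamic_ppca()
for t in range(data.shape[0]):
    w_hat, sigma2_hat = ppca_ml(data[t], n_comp=2)
    print(f"time {t}: true noise var {np.exp(true_log_var[t]):.3f}, "
          f"estimated {sigma2_hat:.3f}")
```

Fitting each time point separately, as done here, ignores the correlation between repeated measurements; a dynamic model of the kind described above would instead estimate the time-varying parameters jointly.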
