Analysis of multi-source metabolomic data using joint and individual variation explained (JIVE).

Metabolic profiling is increasingly used to understand biological processes, but no single analytical technique provides a complete quantitative or qualitative profile of the metabolome. Data fusion (i.e. the joint analysis of data from multiple sources) has the potential to circumvent this limitation, facilitating knowledge discovery and reliable biomarker identification. Data fusion can also be applied to the simultaneous analysis of metabolomic changes across several biofluids or tissues. However, metabolomics typically deals with large datasets containing hundreds to thousands of variables, and identifying shared and individual factors or structures across multiple sources is challenging because of the high variable-to-sample ratios and the differences in intensity and noise range. In this work we apply a recent method, Joint and Individual Variation Explained (JIVE), to the integrated unsupervised analysis of metabolomic profiles from multiple data sources. The method separates the patterns shared among data sources (i.e. the joint structure) from the individual structure of each data source, which is unrelated to the joint structure. Two examples illustrate the applicability of JIVE to the simultaneous analysis of multi-source data: (i) plasma samples subjected to different analytical techniques, sample treatments and measurement conditions; and (ii) plasma and urine samples analysed by liquid chromatography-mass spectrometry under two ionization conditions.
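The separation into joint and individual structure described above can be written, for each data block X_i, as X_i = J_i + A_i + E_i, where J_i is that block's part of a low-rank joint term, A_i is a low-rank individual term whose rows are orthogonal to the joint term, and E_i is residual noise. The following is a minimal sketch of an alternating low-rank fit in that spirit, not the implementation used in this work. The function name `jive_sketch`, its parameters, and the choice to pass the ranks in by hand are assumptions made for illustration; blocks are assumed to share the same samples (columns) and to be row-centred and scaled beforehand.

```python
import numpy as np


def jive_sketch(blocks, joint_rank, indiv_ranks, n_iter=100, tol=1e-8):
    """Illustrative JIVE-style decomposition (hypothetical helper).

    blocks      : list of arrays, each (variables_i x samples), same samples.
    joint_rank  : assumed rank of the joint structure.
    indiv_ranks : assumed rank of the individual structure of each block.
    Returns two lists: joint parts J_i and individual parts A_i.
    """
    sizes = [b.shape[0] for b in blocks]
    splits = np.cumsum(sizes)[:-1]           # row indices separating the blocks
    X = np.vstack(blocks).astype(float)      # stack blocks along the variables
    A = np.zeros_like(X)                     # individual structure, initially 0
    prev_err = np.inf

    for _ in range(n_iter):
        # Joint step: best rank-r approximation of the data minus individual part.
        U, s, Vt = np.linalg.svd(X - A, full_matrices=False)
        J = (U[:, :joint_rank] * s[:joint_rank]) @ Vt[:joint_rank]

        # Projector onto the orthogonal complement of the joint row space.
        V = Vt[:joint_rank].T
        P = np.eye(X.shape[1]) - V @ V.T

        # Individual step: per block, low-rank fit of what the joint term
        # leaves unexplained, restricted to directions orthogonal to it.
        parts = []
        for Xi, Ji, ri in zip(np.split(X, splits), np.split(J, splits), indiv_ranks):
            Ui, si, Vti = np.linalg.svd((Xi - Ji) @ P, full_matrices=False)
            parts.append((Ui[:, :ri] * si[:ri]) @ Vti[:ri])
        A = np.vstack(parts)

        # Stop when the residual sum of squares no longer changes.
        err = np.linalg.norm(X - J - A) ** 2
        if abs(prev_err - err) < tol:
            break
        prev_err = err

    return np.split(J, splits), np.split(A, splits)
```

With two blocks measured on the same samples, for instance `J, A = jive_sketch([X1, X2], 1, [1, 1])`, the stacked joint term has rank at most 1 and each `J[i] @ A[i].T` is numerically zero, which is the sense in which the individual structure is unrelated to the joint structure. Rank selection is deliberately left out of this sketch.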
