Multilevel component analysis of time-resolved metabolic fingerprinting data

Genomics-based technologies in systems biology have become very popular in recent years. These technologies generate large amounts of data, and multivariate data analysis methods are required to extract information from these data. Many of the datasets generated in genomics are multilevel datasets, in which variation occurs at different levels simultaneously (e.g. variation between organisms and variation in time). We introduce multilevel component analysis (MCA) into the field of metabolic fingerprinting to separate these different types of variation. The commonly used principal component analysis (PCA) cannot do this: in a PCA model the different types of variation in a multilevel dataset are confounded. MCA instead generates a separate submodel for each type of variation. Each submodel is a lower-dimensional component model that approximates the corresponding variation and is easier to interpret than the original data. Multilevel simultaneous component analysis (MSCA) is a method within the class of MCA models that offers increased interpretability, because the time-resolved variation of all individuals is expressed in the same subspace. MSCA is applied to a time-resolved metabolomics dataset containing 1H NMR spectra of urine collected from 10 monkeys at 29 time-points over 2 months. The MSCA model contains one submodel describing the biorhythms in the urine composition and one submodel describing the variation between the animals. Using MSCA, the largest biorhythms in the urine composition and the largest variation between the animals are identified. Comparison of the MSCA model with a PCA model of the same data shows that the MSCA model is easier to interpret: it gives a clearer view of the different types of variation in the data because they are not confounded. © 2004 Elsevier B.V. All rights reserved.
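The separation of variation described in the abstract can be illustrated with a minimal sketch. It assumes the spectra are stacked row-wise in a matrix (one row per animal and time-point) with a label per row identifying the animal. The function name `msca`, its arguments, and the returned keys are hypothetical and chosen for illustration; the SVD-based fitting is one common way to fit such component models and is not necessarily the authors' implementation.

```python
import numpy as np

def msca(X, subject, n_between=2, n_within=2):
    """Illustrative two-level decomposition: offset + between + within.

    X        : (observations x variables) data matrix
    subject  : 1-D array with one individual label per row of X
    Returns a dict with the three parts and PCA-style scores/loadings.
    """
    X = np.asarray(X, dtype=float)
    subject = np.asarray(subject).ravel()

    # Overall offset: mean over all individuals and time-points.
    offset = X.mean(axis=0)
    Xc = X - offset

    # Between-individual part: each row is replaced by the mean of its
    # individual, so it carries only the static differences between them.
    labels, idx = np.unique(subject, return_inverse=True)
    counts = np.bincount(idx)
    means = np.zeros((labels.size, X.shape[1]))
    np.add.at(means, idx, Xc)
    means /= counts[:, None]
    between = means[idx]

    # Within-individual part: time-resolved deviations from each
    # individual's own mean.
    within = Xc - between

    def components(M, k):
        # Truncated SVD gives the least-squares rank-k component model.
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        return U[:, :k] * s[:k], Vt[:k].T

    # One loading matrix for the stacked within part means that all
    # individuals share the same subspace for their time variation.
    between_scores, between_loadings = components(between, n_between)
    within_scores, within_loadings = components(within, n_within)

    return {
        "offset": offset,
        "between": between,
        "within": within,
        "between_scores": between_scores,
        "between_loadings": between_loadings,
        "within_scores": within_scores,
        "within_loadings": within_loadings,
    }
```

Because the between and within parts are orthogonal by construction, their sums of squares add up to the total centered sum of squares, which is why the two submodels can be interpreted separately. An ordinary PCA on the centered matrix would instead mix both sources of variation in each component.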
