The application of MANOVA to analyse Arabidopsis thaliana metabolomic data from factorially designed experiments

Metabolomic technologies produce complex multivariate datasets and researchers are faced with the daunting task of extracting information from these data. Principal component analysis (PCA) has been widely applied in the field of metabolomics to reduce data dimensionality and to visualise trends within the complex data. Although PCA is very useful, it cannot handle multi-factorial experimental designs and, often, clear trends of biological interest are not observed when plotting various PC combinations. Even if patterns are observed, PCA provides no measure of their significance. Multivariate analysis of variance (MANOVA) applied to these PCs enables the statistical evaluation of main treatments and, more importantly, their interactions within the experimental design. The power and scope of MANOVA are demonstrated through two different factorially designed metabolomic investigations using Arabidopsis ethylene signalling mutants and their wild-type. One investigation has multiple experimental factors, including challenge with the economically important pathogen Botrytis cinerea, and also replicate experiments, while the second has different sample preparation methods and one level of replication ‘nested’ within the design. In both investigations there are specific factors of biological interest and there are also factors incorporated within the experimental design that affect the data. The versatility of MANOVA is displayed by using data from two different metabolomic techniques: profiling using direct injection mass spectrometry (DIMS) and fingerprinting using Fourier transform infrared (FT-IR) spectroscopy. MANOVA found significant main effects and interactions in both experiments, allowing a more complete and comprehensive interpretation of the variation within each investigation than PCA alone. Canonical variate analysis (CVA) was applied to investigate these effects and their biological significance.
In conclusion, the application of MANOVA followed by CVA provided more information than PCA alone and proved to be a valuable statistical addition in the overwhelming task of analysing metabolomic data.
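
The workflow described above (reduce the data with PCA, then test main effects and their interaction on the PC scores with MANOVA) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are simulated, the factor names `genotype` and `treatment`, the choice of three PCs, and the helper functions `pca_scores` and `wilks_lambda_two_way` are all assumptions made for illustration, and only Wilks' lambda for a balanced two-way design is computed (no p-values, no nested or replicate terms).

```python
import numpy as np


def pca_scores(X, n_components):
    """Project mean-centred data onto its first principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]


def wilks_lambda_two_way(Y, a, b):
    """Wilks' lambda for factor A, factor B and A x B in a balanced two-way design.

    Y is an (n_samples, n_responses) array; a and b are label arrays.
    Lambda = det(E) / det(E + H), where E is the within-cell (error) SSCP
    matrix and H is the hypothesis SSCP matrix for the term being tested.
    """
    p = Y.shape[1]
    grand = Y.mean(axis=0)
    E = np.zeros((p, p))
    H_a = np.zeros((p, p))
    H_b = np.zeros((p, p))
    H_ab = np.zeros((p, p))

    for i in np.unique(a):
        d = Y[a == i].mean(axis=0) - grand
        H_a += (a == i).sum() * np.outer(d, d)
    for j in np.unique(b):
        d = Y[b == j].mean(axis=0) - grand
        H_b += (b == j).sum() * np.outer(d, d)
    for i in np.unique(a):
        for j in np.unique(b):
            cell = Y[(a == i) & (b == j)]
            cell_mean = cell.mean(axis=0)
            resid = cell - cell_mean
            E += resid.T @ resid
            # Interaction deviation: cell mean minus both marginal means plus grand mean.
            d = cell_mean - Y[a == i].mean(axis=0) - Y[b == j].mean(axis=0) + grand
            H_ab += len(cell) * np.outer(d, d)

    det_E = np.linalg.det(E)
    return {
        "A": det_E / np.linalg.det(E + H_a),
        "B": det_E / np.linalg.det(E + H_b),
        "A:B": det_E / np.linalg.det(E + H_ab),
    }


# Simulated example (hypothetical data): 2 genotypes x 2 treatments x 6 replicates,
# 50 measured variables, with a shift added to one genotype/treatment cell.
rng = np.random.default_rng(0)
genotype = np.repeat([0, 1], 12)
treatment = np.tile(np.repeat([0, 1], 6), 2)
X = rng.normal(size=(24, 50))
X[(genotype == 1) & (treatment == 1), :10] += 2.0

scores = pca_scores(X, n_components=3)
result = wilks_lambda_two_way(scores, genotype, treatment)
for term, value in result.items():
    print(f"Wilks' lambda for {term}: {value:.3f}")
```

Values of Wilks' lambda close to 0 indicate that a term explains much of the multivariate variation in the retained PCs, while values close to 1 indicate little effect. In practice a statistics package would normally be used to convert these statistics into F approximations and p-values and to handle unbalanced or nested designs.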
