Mass Spectra-Based Framework for Automated Structural Elucidation of Metabolome Data to Explore Phytochemical Diversity

A novel framework for automated elucidation of metabolite structures in liquid chromatography–mass spectrometry metabolome data was constructed by integrating databases. High-resolution tandem mass spectral data automatically acquired from each metabolite signal were used for database searches. Three distinct databases, KNApSAcK, ReSpect, and the PRIMe standard compound database, were employed for the structural elucidation. The outputs were retrieved using the CAS metabolite identifier for identification and putative annotation. A simple metabolite ontology system was also introduced to attain putative characterization of the metabolite signals. The automated method was applied to the metabolome data sets obtained from the rosette leaves of 20 Arabidopsis accessions. Phenotypic variations in novel Arabidopsis metabolites among these accessions could be investigated using this method.
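The integration step, in which outputs from several databases are brought together through a shared CAS identifier, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name `merge_hits_by_cas`, the tuple layout of a hit, the placeholder identifiers `CAS-A` and `CAS-B`, and in particular the rule that a hit in the standard compound database counts as "identified" while any other hit is a "putative annotation". The abstract does not specify how the framework makes that distinction.

```python
from collections import defaultdict


def merge_hits_by_cas(hits, standard_db="PRIMe standard compound database"):
    """Group per-database search hits by CAS identifier and label each group.

    `hits` is an iterable of (database, cas, name) tuples (hypothetical layout).
    """
    grouped = defaultdict(lambda: {"names": set(), "databases": set()})
    for database, cas, name in hits:
        grouped[cas]["names"].add(name)
        grouped[cas]["databases"].add(database)

    merged = {}
    for cas, entry in grouped.items():
        # Assumed rule (not taken from the paper): a hit in the standard
        # compound database is "identified"; anything else is putative.
        if standard_db in entry["databases"]:
            level = "identified"
        else:
            level = "putative annotation"
        merged[cas] = {
            "names": sorted(entry["names"]),
            "databases": sorted(entry["databases"]),
            "level": level,
        }
    return merged


# Placeholder hits; these are not real CAS numbers or real search results.
example_hits = [
    ("KNApSAcK", "CAS-A", "compound A"),
    ("PRIMe standard compound database", "CAS-A", "compound A"),
    ("ReSpect", "CAS-B", "compound B"),
]

for cas, record in sorted(merge_hits_by_cas(example_hits).items()):
    print(cas, record["level"], record["databases"])
```

Keying the merged records on the identifier rather than on compound names avoids mismatches caused by synonyms or spelling differences between databases.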
