Setup and Annotation of Metabolomic Experiments by Integrating Biological and Mass Spectrometric Metadata

Unbiased metabolomic surveys are used in physiological, clinical and genomic studies to infer genotype-phenotype relationships. Long-term reusability of metabolomic data requires both correct metabolite annotations and consistent biological classifications. We have developed a system that combines mass spectrometric and biological metadata to achieve this goal. First, an XML-based LIMS lets users enter biological metadata that steer laboratory workflows by generating 'classes' that reflect the experimental design. After data acquisition, a relational database system (BinBase) is employed for automated metabolite annotation. It consists of a multi-stage filtering algorithm that matches and generates database objects using mass spectral metadata such as 'retention index', 'purity' and 'signal/noise', together with the biological class information. Once annotation and quantitation are complete for a given experiment, this information is fed back into the LIMS to notify supervisors and users. Eventually, qualitative and quantitative results are released to the public for download or complex queries.
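The filtering step can be illustrated with a minimal sketch. It is not the BinBase implementation: the names `Peak`, `Bin` and `annotate`, the cosine similarity measure, and all threshold values are assumptions chosen for illustration, and 'purity' is assumed here to be a score between 0 and 1 where higher means a cleaner peak. The sketch first tries to match a peak to an existing database object within a retention index window, and generates a new object only when the peak passes stricter quality filters.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Peak:
    retention_index: float
    purity: float              # assumed scale: 0..1, higher is cleaner
    signal_noise: float
    spectrum: Dict[int, float]  # m/z -> intensity
    sample_class: str          # biological class from the LIMS (carried, not filtered on here)


@dataclass
class Bin:
    bin_id: int
    retention_index: float
    spectrum: Dict[int, float]


def cosine_similarity(a: Dict[int, float], b: Dict[int, float]) -> float:
    """Cosine similarity between two sparse mass spectra."""
    dot = sum(a[mz] * b[mz] for mz in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def annotate(
    peak: Peak,
    bins: List[Bin],
    ri_window: float = 5.0,        # hypothetical threshold
    min_similarity: float = 0.8,   # hypothetical threshold
    min_signal_noise: float = 10.0,  # hypothetical threshold
    min_purity: float = 0.5,       # hypothetical threshold
) -> Optional[int]:
    """Return the id of the matched or newly generated bin, or None if rejected."""
    # Stage 1: keep only bins inside the retention index window.
    candidates = [
        b for b in bins
        if abs(b.retention_index - peak.retention_index) <= ri_window
    ]
    # Stage 2: among those, pick the best spectral match above the threshold.
    best = max(
        candidates,
        key=lambda b: cosine_similarity(peak.spectrum, b.spectrum),
        default=None,
    )
    if best is not None and cosine_similarity(peak.spectrum, best.spectrum) >= min_similarity:
        return best.bin_id
    # Stage 3: no match; generate a new bin only for high-quality peaks.
    if peak.signal_noise >= min_signal_noise and peak.purity >= min_purity:
        new_id = max((b.bin_id for b in bins), default=0) + 1
        bins.append(Bin(new_id, peak.retention_index, dict(peak.spectrum)))
        return new_id
    return None
```

In this sketch, the quality filters apply only when a new object would be created, so that noisy peaks can still be annotated against known entries but cannot add spurious ones to the database. Whether the actual system orders its filters this way is not stated in the text above.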
