The MetaboLights repository: curation challenges in metabolomics

MetaboLights is the first general-purpose, open-access, curated repository for metabolomic studies, their raw experimental data and associated metadata, maintained by one of the major open-access data providers in molecular biology. Growth in the number of depositions, the number of samples per study and the size of the data files submitted to MetaboLights makes it challenging to ensure high-quality, standardized data across diverse metabolomic workflows and data representations. Here, we describe the MetaboLights curation pipeline, its challenges and its practical application in quality control of complex data depositions.

Database URL: http://www.ebi.ac.uk/metabolights
