Simple data-reduction method for high-resolution LC-MS data in metabolomics.

BACKGROUND: Metabolomics LC-MS experiments yield large numbers of peaks, only a few of which can be identified by database matching. Many of the remaining peaks are derivatives of identified peaks (e.g., isotope peaks, adducts, fragments and multiply charged molecules). In this article, we present a data-reduction approach that identifies these derivative peaks automatically.

RESULTS: Derivative peaks can be reliably identified by data-driven clustering based on chromatographic peak shape correlation and intensity patterns across biological replicates. On a test data set obtained from Leishmania donovani extracts, we achieved a 60% reduction in the number of peaks. After quality control filtering, almost 80% of the peaks could be putatively identified by database matching.

CONCLUSION: Automated peak filtering substantially speeds up the data-interpretation process.
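The clustering idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the greedy grouping strategy, the peak fields (`rt`, `shape`, `intensities`), the retention-time window and both correlation thresholds are assumptions chosen only for illustration. A peak is treated as a derivative of a more intense base peak when it co-elutes with it, has a correlated chromatographic peak shape, and shows a correlated intensity pattern across replicates.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def group_derivatives(peaks, rt_window=5.0, min_shape_corr=0.9,
                      min_intensity_corr=0.9):
    """Greedily group peaks around base peaks.

    Each peak is a dict with hypothetical keys:
      'id'          - label of the peak
      'rt'          - retention time in seconds
      'shape'       - sampled chromatographic profile
      'intensities' - peak intensity in each biological replicate
    All thresholds are illustrative assumptions, not published values.
    """
    # Visit the most intense peaks first so that each group is seeded
    # by its presumed base peak.
    ordered = sorted(peaks, key=lambda p: max(p["intensities"]), reverse=True)
    groups = []  # each group is a list whose first element is the base peak
    for peak in ordered:
        for group in groups:
            base = group[0]
            if (abs(peak["rt"] - base["rt"]) <= rt_window
                    and pearson(peak["shape"], base["shape"]) >= min_shape_corr
                    and pearson(peak["intensities"],
                                base["intensities"]) >= min_intensity_corr):
                group.append(peak)  # treated as a derivative of the base peak
                break
        else:
            groups.append([peak])  # no match: this peak starts a new group
    return [[p["id"] for p in g] for g in groups]


# Toy data: "M+1" mimics an isotope peak of "M"; "X" co-elutes but differs in shape.
example_peaks = [
    {"id": "M", "rt": 120.0, "shape": [1, 5, 10, 5, 1],
     "intensities": [100, 200, 150]},
    {"id": "M+1", "rt": 120.5, "shape": [0.1, 0.5, 1.0, 0.5, 0.1],
     "intensities": [10, 20, 15]},
    {"id": "X", "rt": 121.0, "shape": [1, 2, 4, 8, 9],
     "intensities": [50, 40, 90]},
]
groups = group_derivatives(example_peaks)
```

In this toy example the isotope-like peak is grouped with its base peak, while the co-eluting peak with a different profile is kept separate. Retaining only the first member of each group gives the kind of data reduction the abstract describes.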
