Assignment of MS-based metabolomic datasets via compound interaction pair mapping

Assignment of physical meaning to mass spectrometry (MS) data peaks is an important scientific challenge for metabolomics investigators. Improvements in instrumental mass accuracy reduce the number of spurious database matches; however, mass accuracy alone is insufficient for accurate, unique, high-throughput assignment. We present a method for clustering MS instrumental artifacts and a stochastic local search algorithm for the automated assignment of large, complex MS-based metabolomic datasets. Artifact peaks and their associated source peaks are grouped into “instrumental clusters.” Instrumental clusters, which are peaks grouped together by shared peak shape in the temporal domain, serve as a guide to the number of assignments necessary to completely explain a given dataset. We refine mass-only assignments by intersecting peak correlation pairs with a database of biochemically relevant interaction pairs. Further refinement is achieved through a stochastic local search optimization algorithm that selects an individual assignment for each instrumental cluster. The algorithm works by choosing the peak assignment that maximally explains the connectivity of a given cluster. We demonstrate that this methodology provides a significant advantage over standard methods for the assignment of metabolites in a UPLC-MS diabetes dataset.
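
The refinement described above can be illustrated with a minimal sketch. It is not the authors' implementation: the scoring function, the noise parameter, the function names, and the toy data (cluster labels `c1` to `c3`, compound labels `A1` to `C2`) are all assumptions made for illustration. Each instrumental cluster carries several mass-only candidate assignments; the score counts correlated cluster pairs whose assigned compounds appear in a hypothetical interaction-pair database; and a simple stochastic local search re-selects one cluster's assignment at a time.

```python
import random

def score(assignment, correlated, interactions):
    """Count correlated cluster pairs whose assigned compounds
    form a known interaction pair (assumed scoring function)."""
    return sum(
        1
        for a, b in correlated
        if frozenset((assignment[a], assignment[b])) in interactions
    )

def local_search(candidates, correlated, interactions,
                 steps=1000, noise=0.1, seed=0):
    """Stochastic local search over one assignment per cluster."""
    rng = random.Random(seed)
    # Start from a random mass-only assignment for every cluster.
    current = {c: rng.choice(opts) for c, opts in candidates.items()}
    best, best_score = dict(current), score(current, correlated, interactions)
    clusters = list(candidates)
    for _ in range(steps):
        c = rng.choice(clusters)
        if rng.random() < noise:
            # Random-walk move: helps escape plateaus and local optima.
            current[c] = rng.choice(candidates[c])
        else:
            # Greedy move: keep the candidate that explains the most
            # connectivity (ties go to the first listed candidate).
            def trial(opt):
                return score({**current, c: opt}, correlated, interactions)
            current[c] = max(candidates[c], key=trial)
        s = score(current, correlated, interactions)
        if s > best_score:
            best, best_score = dict(current), s
    return best, best_score

# Toy data: all labels are hypothetical placeholders.
candidates = {"c1": ["A1", "A2"], "c2": ["B1", "B2"], "c3": ["C1", "C2"]}
correlated = [("c1", "c2"), ("c2", "c3")]          # peak correlation pairs
interactions = {frozenset(("A1", "B2")),           # interaction-pair database
                frozenset(("B2", "C1"))}

best, best_score = local_search(candidates, correlated, interactions)
print(best, best_score)
```

In this sketch, the intersection of correlation pairs with the interaction database is the membership test inside `score`; a real dataset would likely need weighted correlations and restarts, which are omitted here.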
