Illuminating the dark matter in metabolomics

Despite the more than 100-year history of mass spectrometry, it remains challenging to link the large volume of known chemical structures to the data obtained with mass spectrometers. Presently, only 1.8% of spectra in an untargeted metabolomics experiment can be annotated. This means that the vast majority of the information collected in metabolomics is “dark matter”: chemical signatures that remain uncharacterized (Fig. 1). For a genomic comparison, 80% of predicted genes in the Escherichia coli genome are known. In a bacteriophage metagenome, a well-known frontier of biological dark matter, the fraction of known genes is 1–30%, depending on the sample (1). Thus, one could argue that we know more about the genetics of uncultured phage than we do about the chemistry within our own bodies. Much of the chemical dark matter may consist of known structures that remain unidentified because their reference spectra are not available in mass spectrometry databases. The only way to overcome this challenge is through the development of computational solutions. In PNAS, Dührkop et al. describe the development of such a computational tool, called CSI (compound structure identification):FingerID (2). The tool is designed to aid in the annotation of chemistries that can be observed by mass spectrometry. CSI:FingerID uses fragmentation trees to connect tandem MS (MS/MS) data to chemical structures found in public chemistry databases. Tools such as this can allow mass spectrometry-based metabolomics to become as commonly used and scientifically productive as sequencing technologies have become in the field of genomics.

[1] S. Böcker, et al. Searching molecular structure databases with tandem mass spectra using CSI:FingerID, 2015, Proceedings of the National Academy of Sciences of the United States of America.

[2] Jiasheng Tu, et al. Rapid, sensitive and selective liquid chromatography-tandem mass spectrometry (LC-MS/MS) method for the quantification of topically applied azithromycin in rabbit conjunctiva tissues, 2010, Journal of pharmaceutical and biomedical analysis.

[3] Andreas Bender, et al. Understanding and Classifying Metabolite Space and Metabolite-Likeness, 2011, PloS one.

[4] Bernd Markus Lange, et al. Open-Access Metabolomics Databases for Natural Product Research: Present Capabilities and Future Potential, 2015, Front. Bioeng. Biotechnol.

[5] E. Myers, et al. Basic local alignment search tool, 1990, Journal of molecular biology.

[6] Hiroyuki Ogata, et al. KEGG: Kyoto Encyclopedia of Genes and Genomes, 1999, Nucleic Acids Res.

[7] Frederick P. Roth, et al. Chemical substructures that enrich for biological activity, 2008, Bioinform.

[8] S. Neumann, et al. Metabolite profiling and beyond: approaches for the rapid processing and annotation of human blood serum mass spectrometry data, 2013, Analytical and Bioanalytical Chemistry.

[9] Steffen Neumann, et al. Tackling CASMI 2012: Solutions from MetFrag and MetFusion, 2013, Metabolites.

[10] Tomáš Pluskal, et al. Highly accurate chemical formula prediction tool utilizing high-resolution mass spectra, MS/MS fragmentation, heuristic rules, and isotope pattern matching, 2012, Analytical chemistry.

[11] O. Fiehn, et al. Using fragmentation trees and mass spectral trees for identifying unknown compounds in metabolomics, 2015, Trends in analytical chemistry: TRAC.

[12] Oliver Fiehn, et al. Advances in structure elucidation of small molecules using mass spectrometry, 2010, Bioanalytical reviews.

[13] Russ Greiner, et al. Competitive fragmentation modeling of ESI-MS/MS spectra for putative metabolite identification, 2013, Metabolomics.

[14] F. Rohwer, et al. Metagenomics and future perspectives in virus discovery, 2012, Current Opinion in Virology.

[15] Christoph Steinbeck, et al. The ChEBI reference database and ontology for biologically relevant chemistry: enhancements for 2013, 2012, Nucleic Acids Res.

[16] Juho Rousu, et al. Metabolite identification and molecular fingerprint prediction through machine learning, 2012, Bioinform.

[17] Pieter C. Dorrestein, et al. Mass spectrometry of natural products: current, emerging and future technologies, 2014, Natural product reports.