Chemically-informed Analyses of Metabolomics Mass Spectrometry Data with Qemistree

Untargeted mass spectrometry is employed to detect small molecules in complex biospecimens, generating data that are difficult to interpret. We developed Qemistree, a data exploration strategy based on hierarchical organization of molecular fingerprints predicted from fragmentation spectra, represented in the context of sample metadata and chemical ontologies. By expressing molecular relationships as a tree, we can apply ecological tools, designed around the relatedness of DNA sequences, to study chemical composition.
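
The idea of arranging molecular fingerprints into a tree can be illustrated with a minimal sketch. It is not the Qemistree implementation. The four binary fingerprints are invented toy data, and the choice of Jaccard distance with average-linkage clustering is an assumption made for illustration, since the text above does not specify either. The helper names (`jaccard_distance`, `build_tree`, `to_newick`) are hypothetical.

```python
from itertools import combinations

# Toy binary fingerprints (invented for illustration): each position
# marks the presence (1) or absence (0) of a structural property.
fingerprints = {
    "feature_A": [1, 1, 1, 0, 0, 0],
    "feature_B": [1, 1, 0, 0, 0, 0],
    "feature_C": [0, 0, 0, 1, 1, 1],
    "feature_D": [0, 0, 0, 1, 1, 0],
}


def jaccard_distance(a, b):
    """Jaccard distance between two equal-length binary fingerprints."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def build_tree(fps):
    """Average-linkage agglomerative clustering (an assumed choice).

    Returns the tree as nested tuples whose leaves are feature names.
    """
    # Pairwise distances between individual features.
    dist = {
        frozenset((a, b)): jaccard_distance(fps[a], fps[b])
        for a, b in combinations(fps, 2)
    }

    def linkage(members1, members2):
        # Mean distance over all cross-cluster pairs of features.
        total = sum(dist[frozenset((a, b))] for a in members1 for b in members2)
        return total / (len(members1) * len(members2))

    # Each cluster is (subtree, list of member feature names).
    clusters = [(name, [name]) for name in sorted(fps)]
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]][1], clusters[p[1]][1]),
        )
        (tree1, members1), (tree2, members2) = clusters[i], clusters[j]
        merged = ((tree1, tree2), members1 + members2)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


def to_newick(tree):
    """Serialize the nested-tuple tree as a Newick topology (no branch lengths)."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(to_newick(child) for child in tree) + ")"


tree = build_tree(fingerprints)
print(to_newick(tree) + ";")
```

Features with similar fingerprints end up as neighbors in the tree. Once the relationships are written in a standard tree format such as Newick, the tree can in principle be passed to tree-aware ecological tools in the same way a phylogeny of DNA sequences would be.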
