Auto-deconvolution and molecular networking of gas chromatography–mass spectrometry data

We engineered a machine learning approach, MSHub, to enable auto-deconvolution of gas chromatography–mass spectrometry (GC–MS) data. We then designed workflows that enable the community to store, process, share, annotate, compare and perform molecular networking of GC–MS data within the Global Natural Products Social Molecular Networking (GNPS) analysis platform. MSHub/GNPS performs auto-deconvolution of compound fragmentation patterns via unsupervised non-negative matrix factorization and quantifies the reproducibility of fragmentation patterns across samples.
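To make the factorization idea concrete, the following is a minimal sketch of non-negative matrix factorization applied to a synthetic scans-by-m/z intensity matrix containing two co-eluting compounds. It is not the MSHub implementation: the function name `nmf`, the multiplicative-update solver, the synthetic peak shapes and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np


def nmf(X, rank, n_iter=1000, seed=0, eps=1e-9):
    """Factor a non-negative matrix X (scans x m/z) as W @ H.

    Illustrative solver using Lee-Seung multiplicative updates.
    W holds elution profiles (scans x rank); H holds spectra (rank x m/z).
    """
    rng = np.random.default_rng(seed)
    n_scans, n_mz = X.shape
    W = rng.random((n_scans, rank)) + eps
    H = rng.random((rank, n_mz)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative by construction.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic data (assumed for illustration): two overlapping Gaussian
# elution profiles, each with its own fragmentation pattern.
scans = np.arange(60)
profiles = np.stack(
    [
        np.exp(-0.5 * ((scans - 25) / 4.0) ** 2),
        np.exp(-0.5 * ((scans - 32) / 4.0) ** 2),
    ],
    axis=1,
)  # shape (60, 2)

spectra = np.zeros((2, 40))
spectra[0, [5, 12, 30]] = [1.0, 0.6, 0.3]  # compound A fragments
spectra[1, [8, 12, 22]] = [0.8, 0.4, 1.0]  # compound B fragments

X = profiles @ spectra  # observed scans x m/z matrix

W, H = nmf(X, rank=2)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_error:.4f}")
```

Each row of `H` can be read as a candidate deconvolved spectrum and each column of `W` as its elution profile. The number of components (`rank`) is fixed by hand in this sketch; how MSHub chooses it is not described in the abstract.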

Wout Bittremieux | Zheng Zhang | Yann Guitton | Mingxun Wang | Rob Knight | Madeleine Ernst | Justin J J van der Hooft | Itzhak Mizrahi | Ivan Laponogov | Pieter C Dorrestein | Alexey V Melnik | Louis Felix Nothias | Daniel Petras | Biswapriya B Misra | Reza Mirnezami | James T. Morton | Ilaria Belluomo | Dennis Veselkov | Mélissa Nothias-Esposito | Kathleen Dorrestein | Morgan Panitchpakdi | Chiara Carazzone | Adolfo Amézquita | Chris Callewaert | Amina Bouslimani | Sneha P. Couvillion | Meagan C. Burnet | Viatcheslav Artaev | Elizabeth Humston-Fulmer | Rachel Gregor | Stav Eyal | Brooke Anderson | Raphaël Lugan | Pauline Le Boulch | Stephanie Prevost | Audrey Poirier | Gaud Dervilly | Aaron Fait | Noga Sikron Persi | Chao Song | Kelem Gashu | Roxana Coras | Monica Guma | Julia Manasson | Vasilis Vasiliou | Kirill Veselkov | Thomas O Metz | Alisdair R Fernie | Dinesh Kumar Barupal | Bruno Le Bizec | George B Hanna | Alexander A Aksenov | Sophie L F Doran | Katherine N Maloney | Aleksandr Smirnov | Xiuxia Du | Kenneth L Jones | Mabel Gonzalez | Robert A Quinn | Andrea Albarracín Orio | Andrea M Smania | Carrie D Nicora | Erika Zink | Michael M Meijler | Rachel Dutton | Jose U Scher | Saleh Alseekh | Robin Schmid | Roman S Borisov | Larisa N Kulikova
