Untargeted LC–MS Data Preprocessing in Metabolomics

Liquid chromatography–mass spectrometry (LC–MS) enables the detection of thousands of metabolite features from a single biological sample, producing large and complex datasets. A key challenge in LC–MS-based metabolomics is the comprehensive and accurate analysis of this enormous amount of data. Many free data preprocessing tools, such as XCMS, MZmine, MAVEN, and MetaboAnalyst, as well as commercial software, have been developed to facilitate data processing. However, researchers remain challenged by the unavoidable generation of numerous false-positive peaks and by human error when such false peaks are removed manually. Even with continuous improvements in data processing tools, many mistakes can still arise during data preprocessing. In addition, many data preprocessing software packages exist, and each has its own advantages and disadvantages. A researcher therefore needs to judge which software or tools best suit their vendor proprietary data formats and the goals of downstream analysis. Here, we provide a brief introduction to the general steps of raw MS data processing and the properties of automated data processing tools. We then summarize the characteristics of data preprocessing software, mainly free tools, for researchers to consider when conducting metabolomics studies.
