Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery.

Preprocessing software, which converts large instrumental data sets into a manageable format for data analysis, is crucial for the discovery of chemical signatures in metabolomics, chemical forensics, and other signature-focused disciplines. Here, four freely available, published preprocessing tools (MetAlign, MZmine, SpectConnect, and XCMS) were evaluated for impurity profiling using nominal mass GC/MS data and accurate mass LC/MS data. Both data sets had been collected previously from the analysis of replicate samples from multiple stocks of a nerve-agent precursor and from method blanks. Parameters were optimized for each of the four tools for the untargeted detection, matching, and cataloging of chromatographic peaks from impurities present in the stock samples. The peak table generated by each preprocessing tool was analyzed to determine the number of impurity components detected in all replicate samples per stock and absent in the method blanks. A cumulative set of impurity components was then generated using all available peak tables and used as a reference to calculate the percentage of component detections for each tool, where 100% indicates the detection of every known component present in a stock. For the nominal mass GC/MS data, MetAlign had the most component detections, followed by MZmine, SpectConnect, and XCMS, with detection percentages of 83, 60, 47, and 41%, respectively. For the accurate mass LC/MS data, the order was MetAlign, XCMS, and MZmine, with detection percentages of 80, 45, and 35%, respectively. SpectConnect did not function for the accurate mass LC/MS data. Larger detection percentages were obtained by combining the top performer with at least one of the other tools: 96% by combining MetAlign with MZmine for the GC/MS data, and 93% by combining MetAlign with XCMS for the LC/MS data.
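The detection-percentage calculation described above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that components have already been matched across tools and reduced to shared identifiers (in practice this matching would depend on retention time and mass tolerances), and that a single stock is being scored. The function names and the component labels are hypothetical.

```python
def impurity_components(sample_peaks, blank_peaks):
    """Components found in every replicate sample and in no method blank.

    sample_peaks: list of sets, one set of component IDs per replicate.
    blank_peaks:  list of sets, one set of component IDs per method blank.
    """
    in_all_replicates = set.intersection(*map(set, sample_peaks))
    in_any_blank = set().union(*map(set, blank_peaks))
    return in_all_replicates - in_any_blank


def detection_percentages(detected_by_tool):
    """Percent of the cumulative reference set that each tool detected.

    detected_by_tool: dict mapping tool name -> set of component IDs.
    The reference is the union over all tools, so 100% means that a tool
    found every component known from any of the peak tables.
    """
    reference = set().union(*detected_by_tool.values())
    return {
        tool: 100.0 * len(found) / len(reference)
        for tool, found in detected_by_tool.items()
    }


def combined_percentage(detected_by_tool, tools):
    """Detection percentage obtained by pooling several tools."""
    reference = set().union(*detected_by_tool.values())
    pooled = set().union(*(detected_by_tool[t] for t in tools))
    return 100.0 * len(pooled) / len(reference)


# Hypothetical example with made-up component labels.
tables = {
    "tool_a": {"c1", "c2", "c3"},
    "tool_b": {"c3", "c4"},
}
percentages = detection_percentages(tables)
pooled = combined_percentage(tables, ["tool_a", "tool_b"])
```

Because the reference set is the union of all peak tables, pooling every tool always gives 100% under this definition. The percentages measure coverage relative to the other tools, not against an independently known list of impurities.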
In terms of quantitative performance, the reported peak intensities from each tool had averaged absolute biases (relative to peak intensities obtained using instrument software) of 41, 4.4, 1.3, and 1.3% for SpectConnect, MetAlign, XCMS, and MZmine, respectively, for the GC/MS data. For the LC/MS data, the averaged absolute biases were 22, 4.5, and 3.1% for MetAlign, MZmine, and XCMS, respectively. In summary, MetAlign performed best in terms of the number of component detections; however, more than one preprocessing tool should be considered to avoid missing impurities or other trace components that could serve as chemical signatures.
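As a rough illustration of the quantitative comparison, the sketch below computes an averaged absolute bias in Python. The abstract does not give the exact formula, so the sketch assumes that bias is the percent difference between a tool's reported peak intensity and the instrument-software intensity for the same peak, averaged over peaks. The function name and the intensity values are hypothetical.

```python
def averaged_absolute_bias(tool_intensities, reference_intensities):
    """Mean of |tool - reference| / reference, expressed in percent.

    Both arguments are equal-length sequences of intensities for the
    same peaks, in the same order. Reference values must be non-zero.
    """
    if not tool_intensities or len(tool_intensities) != len(reference_intensities):
        raise ValueError("intensity lists must be non-empty and of equal length")
    biases = [
        100.0 * abs(tool - ref) / ref
        for tool, ref in zip(tool_intensities, reference_intensities)
    ]
    return sum(biases) / len(biases)


# Hypothetical intensities: one peak over-reported, one under-reported.
bias = averaged_absolute_bias([110.0, 90.0], [100.0, 100.0])
```

Taking the absolute value before averaging keeps over- and under-estimates from cancelling, so a tool that is sometimes too high and sometimes too low does not appear unbiased.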
