Deconvolution using signal segmentation

Extracting peak areas and mass spectral information from chromatography-mass spectrometry data, such as those obtained in metabolomics measurements, requires much effort, and the quality of the result often depends on the operator who handles the data. In multiple-file deconvolution, all samples are processed simultaneously and alignment issues are part of the modeling strategy. However, processing the total data set as a whole is not feasible, so the data must be segmented. Two intertwined divide-and-conquer strategies are proposed. The first divides the retention time axis into equal parts; the second divides the total data set into a model data set and a prediction data set. Dividing the data into smaller segments makes the total problem tractable. Post-processing of the resulting matrices of peak areas and mass spectra yields a matrix of peak areas ready for statistics and a matrix of mass spectral information ready for peak annotation. The proposed methodology is implemented in a package called TNO-DECO but can easily be implemented in other data pre-processing approaches.
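
The two segmentation steps can be illustrated with a minimal Python sketch. It is not the TNO-DECO implementation: the function names `split_retention_axis` and `split_model_prediction` are hypothetical, the retention time axis is assumed to be indexed by scan number, any remainder scans are assumed to be spread over the first segments, and taking the first samples as the model set is an arbitrary choice made only for illustration.

```python
from typing import List, Sequence, Tuple


def split_retention_axis(n_scans: int, n_segments: int) -> List[Tuple[int, int]]:
    """Return half-open (start, stop) scan-index ranges of near-equal length.

    Hypothetical helper: assumes the retention time axis is indexed by scan.
    """
    if n_segments < 1 or n_segments > n_scans:
        raise ValueError("n_segments must be between 1 and n_scans")
    base, extra = divmod(n_scans, n_segments)
    bounds = []
    start = 0
    for i in range(n_segments):
        # Assumption: the first `extra` segments each take one additional scan.
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def split_model_prediction(
    samples: Sequence[str], n_model: int
) -> Tuple[List[str], List[str]]:
    """Split sample names into a model set and a prediction set.

    Assumption: the first `n_model` samples form the model set; the source
    does not state how the model samples are selected.
    """
    if not 0 < n_model <= len(samples):
        raise ValueError("n_model must be between 1 and the number of samples")
    return list(samples[:n_model]), list(samples[n_model:])


if __name__ == "__main__":
    segments = split_retention_axis(n_scans=10, n_segments=3)
    model, prediction = split_model_prediction(
        ["s1", "s2", "s3", "s4", "s5"], n_model=2
    )
    for start, stop in segments:
        # Each segment would be deconvolved on the model samples first,
        # and the prediction samples processed afterwards.
        print(f"scans {start}-{stop - 1}: model={model}, prediction={prediction}")
```

In this sketch each retention time segment is a small, independent problem, and only the model samples would need to be held together while a segment is resolved, which is the sense in which the two divisions are intertwined.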
