Time alignment algorithms based on selected mass traces for complex LC-MS data.

Time alignment of complex LC-MS data remains a challenge in proteomics and metabolomics studies. This work describes modifications of the Dynamic Time Warping (DTW) and Parametric Time Warping (PTW) algorithms that improve the alignment quality for complex, highly variable LC-MS data sets. Standard DTW and PTW use one-dimensional profiles such as the Total Ion Chromatogram (TIC) or Base Peak Chromatogram (BPC), which yields correct alignment when the signals have a relatively simple structure. However, when aligning the TICs of chromatograms from complex mixtures with large concentration variability, such as serum or urine, both algorithms often misalign peaks and thus lead to incorrect comparisons in the subsequent statistical analysis. This is mainly because compounds with different m/z values but similar retention times are not considered separately: they are confounded in the benefit function of algorithms that use only one-dimensional information. It is therefore necessary to treat the information of different mass traces separately in the warping function, so that compounds having the same m/z value and retention time are aligned to each other. The Component Detection Algorithm (CODA) is widely used to calculate the quality of an LC-MS mass trace. By combining CODA with the warping algorithms of DTW or PTW (DTW-CODA or PTW-CODA), we include in the benefit function only those mass traces that CODA rates as high quality. Our results show that using several CODA-selected, high-quality mass traces in DTW-CODA and PTW-CODA significantly improves the alignment quality of three different, highly complex LC-MS data sets. Moreover, DTW-CODA preserves peak shape better than the original DTW-TIC algorithm, which often suffers from substantial peak shape distortion.
These results indicate that combining CODA-selected mass traces with different time alignment algorithms is a general principle that provides accurate alignment for highly complex samples with large concentration variability.
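
The idea of warping on several selected mass traces rather than on the TIC can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`quality_score`, `select_traces`, `dtw_path`), the threshold of 0.5, and the synthetic data are all assumptions made for illustration. The quality score is a simplified CODA-like measure (similarity between a raw trace and its smoothed, mean-centered version), not the exact CODA definition, and the warping minimizes a summed squared-difference cost over the selected traces instead of maximizing a benefit function.

```python
import math

def quality_score(trace, window=3):
    """Simplified CODA-like score (assumption, not the exact CODA index):
    cosine similarity between the raw trace and its smoothed,
    mean-centered version. Spiky or high-background traces score low."""
    half = window // 2
    m = len(trace) - window + 1
    smooth = [sum(trace[i:i + window]) / window for i in range(m)]
    mean = sum(smooth) / m
    centered = [s - mean for s in smooth]
    raw = trace[half:half + m]  # align raw points with window centers
    norm_raw = math.sqrt(sum(v * v for v in raw))
    norm_c = math.sqrt(sum(v * v for v in centered))
    if norm_raw == 0 or norm_c == 0:
        return 0.0
    return sum(r * c for r, c in zip(raw, centered)) / (norm_raw * norm_c)

def select_traces(traces, threshold=0.5):
    """Indices of traces whose score exceeds a (hypothetical) threshold."""
    return [k for k, t in enumerate(traces) if quality_score(t) > threshold]

def dtw_path(ref, sample):
    """DTW over several mass traces at once. `ref` and `sample` are lists
    of traces in the same m/z order. The local cost sums squared
    differences over all traces, so each m/z is compared only with itself."""
    n, m = len(ref[0]), len(sample[0])
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = sum((r[i - 1] - s[j - 1]) ** 2 for r, s in zip(ref, sample))
            D[i][j] = cost + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    # Backtrack from the end to recover the warping path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min((D[i - 1][j - 1], i - 1, j - 1),
                      (D[i - 1][j], i - 1, j),
                      (D[i][j - 1], i, j - 1))
        path.append((i - 1, j - 1))
    path.reverse()
    return path

def gaussian(center, length=40, sigma=2.0):
    return [math.exp(-((t - center) ** 2) / (2 * sigma ** 2)) for t in range(length)]

# Synthetic data: two clean traces plus one spiky noise trace.
spikes = [1.0 if t % 2 == 0 else 0.0 for t in range(40)]
reference = [gaussian(10), gaussian(20), spikes]
shifted = [gaussian(13), gaussian(23), spikes]  # peaks elute 3 scans later

keep = select_traces(reference)
path = dtw_path([reference[k] for k in keep], [shifted[k] for k in keep])
```

Here `keep` holds the indices of the traces that pass the quality filter, and `path` is a list of `(reference_scan, sample_scan)` pairs that can be used to map retention times of the sample onto the reference. A PTW variant would replace `dtw_path` with a fit of a low-order polynomial warping function to the same selected traces.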
