MassUntangler: a novel alignment tool for label-free liquid chromatography-mass spectrometry proteomic data.

Liquid chromatography-mass spectrometry (LC-MS) has become an important analytical tool for quantitative proteomics and biomarker discovery. In the label-free differential LC-MS approach, computational methods are required to accurately align the peaks extracted from the experimental raw data, taking into account retention time and m/z signal intensity, both of which are strongly affected by the sample matrix and instrumental performance. A novel procedure for pairwise alignment, "MassUntangler", has been developed. It relies on a pattern-based matching algorithm integrated with filtering algorithms in a multi-step approach. The procedure was optimized in two steps. In the first step, low-complexity LC-MS data derived from the enzymatic digestion of two standard proteins were analyzed, and the algorithm's performance was evaluated by comparing the results with those achieved using state-of-the-art alignment tools. In the second step, the algorithm was used to align high-complexity LC-MS data consisting of peptides obtained from an Escherichia coli lysate, available from a public repository and previously used for the comparison of other alignment tools. MassUntangler gave excellent results in terms of precision scores (from 80% to 93%) and recall scores (from 68% to 89%), showing performance similar to, and in some cases better than, that of previously developed tools. Considering the sensitivity and accuracy of mass spectrometry, this approach allows the identification and quantification of peptides present in a biological sample at the femtomole level with high confidence. The procedure's capability of aligning LC-MS data previously corrected for retention time distortion was studied through a hybrid approach, in which MassUntangler was interfaced with the OpenMS TOPP tool MapAligner. The hybrid aligner yielded better results, indicating that different bioinformatic approaches should be integrated for accurate label-free LC-MS data alignment.
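As an illustration of how precision and recall scores such as those reported above can be computed for a pairwise alignment, the following minimal Python sketch uses the usual set-based definitions. The abstract does not state the exact definitions used to score MassUntangler, so the function name `precision_recall`, the representation of an alignment as a set of feature-ID pairs, and the example IDs are all assumptions made for illustration only.

```python
def precision_recall(predicted_pairs, true_pairs):
    """Score a pairwise alignment against a reference alignment.

    Both arguments are iterables of (feature_id_run_a, feature_id_run_b)
    tuples. This representation is hypothetical and not taken from the paper.
    """
    predicted = set(predicted_pairs)
    truth = set(true_pairs)

    # Pairs reported by the aligner that also appear in the reference.
    true_positives = len(predicted & truth)

    # Precision: fraction of reported pairs that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of reference pairs that were recovered.
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical example: five reference pairs, four reported, three correct.
truth = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4"), ("a5", "b5")}
predicted = {("a1", "b1"), ("a2", "b2"), ("a3", "b4"), ("a5", "b5")}

precision, recall = precision_recall(predicted, truth)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

In this toy case three of the four reported pairs are correct and three of the five reference pairs are recovered, so precision is 0.75 and recall is 0.60. A high precision with a lower recall, as in the ranges quoted above, would correspond to an aligner that reports few wrong pairs but misses some true ones.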
