MultiAlign: a multiple LC-MS analysis tool for targeted omics analysis

Background

MultiAlign is a free software tool that aligns multiple liquid chromatography-mass spectrometry datasets to one another by clustering mass and chromatographic elution features across datasets. Applicable to both label-free proteomics and metabolomics comparative analyses, the software can be operated in several modes. For example, clustered features can be matched to a reference database to identify analytes, used to generate abundance profiles, linked to tandem mass spectra based on parent precursor masses, and culled for targeted liquid chromatography-tandem mass spectrometric analysis. MultiAlign is also capable of tandem mass spectral clustering to describe proteome structure and find similarity in subsequent sample runs.

Results

MultiAlign was applied to two large proteomics datasets obtained from liquid chromatography-mass spectrometry analyses of environmental samples. Peptides in the datasets for a microbial community that had a known metagenome were identified by matching mass and elution time features to those in an established reference peptide database. Results compared favorably with those obtained using existing tools such as VIPER, but with the added benefit of being able to trace clusters of peptides across conditions to existing tandem mass spectra. MultiAlign was further applied to detect clusters across experimental samples derived from a reactor biomass community for which no metagenome was available. Several clusters were culled for further analysis to explore changes in the community structure. Lastly, MultiAlign was applied to liquid chromatography-mass spectrometry-based datasets obtained from a previously published study of wild type and mitochondrial fatty acid oxidation enzyme knockdown mutants of human hepatocarcinoma to demonstrate its utility for analyzing metabolomics datasets.

Conclusion

MultiAlign is an efficient software package for finding similar analytes across multiple liquid chromatography-mass spectrometry feature maps, as demonstrated here for both proteomics and metabolomics experiments. The software is particularly useful for proteomic studies where little or no genomic context is known, such as with environmental proteomics.
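
The idea of matching mass and elution time features to a reference database can be illustrated with a minimal sketch. This is not MultiAlign's implementation: the `Feature` type, the `match_features` function, the tolerance values, and the example entries are all hypothetical, and the sketch assumes that elution times have already been normalized to a common 0 to 1 scale across datasets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Hypothetical LC-MS feature: a mass paired with an elution time."""
    mass: float  # monoisotopic mass in Da
    net: float   # normalized elution time on a 0..1 scale (assumed pre-aligned)


def ppm_error(observed: float, reference: float) -> float:
    """Signed mass error of `observed` relative to `reference`, in parts per million."""
    return (observed - reference) / reference * 1e6


def match_features(features, database, ppm_tol=10.0, net_tol=0.02):
    """Return (feature_index, reference_name) pairs within both tolerances.

    The tolerances are illustrative defaults, not values taken from the paper.
    """
    matches = []
    for index, feature in enumerate(features):
        for name, reference in database.items():
            mass_ok = abs(ppm_error(feature.mass, reference.mass)) <= ppm_tol
            net_ok = abs(feature.net - reference.net) <= net_tol
            if mass_ok and net_ok:
                matches.append((index, name))
    return matches


# Made-up observed features and reference entries, for illustration only.
observed = [Feature(mass=1000.005, net=0.50), Feature(mass=1500.0, net=0.30)]
reference_db = {
    "PEPTIDE_A": Feature(mass=1000.0, net=0.51),
    "PEPTIDE_B": Feature(mass=1500.1, net=0.30),
}

print(match_features(observed, reference_db))  # [(0, 'PEPTIDE_A')]
```

The first feature lies about 5 ppm and 0.01 elution time units from `PEPTIDE_A`, so it matches. The second is roughly 67 ppm from `PEPTIDE_B` and is rejected on mass alone. A brute-force double loop is used for readability; a real tool handling large datasets would likely index the reference entries by mass.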
