aLFQ: an R-package for estimating absolute protein quantities from label-free LC-MS/MS proteomics data

Motivation: The determination of absolute quantities of proteins in biological samples is necessary for multiple types of scientific inquiry. While relative quantification has been commonly used in proteomics, few proteomic datasets measuring absolute protein quantities have been reported to date. Various technologies have been applied, using different types of input data (e.g. ion intensities or spectral counts) as well as different absolute normalization strategies. To date, user-friendly and transparent software supporting large-scale absolute protein quantification has been lacking.

Results: We present a bioinformatics tool, termed aLFQ, which supports the commonly used absolute label-free protein abundance estimation methods (TopN, iBAQ, APEX, NSAF and SCAMPI) for LC-MS/MS proteomics data, together with validation algorithms enabling automated data analysis and error estimation.

Availability and implementation: aLFQ is written in R and is freely available under the GPLv3 from CRAN (http://www.cran.r-project.org). Instructions and example data are provided in the R-package. The raw data can be obtained from the PeptideAtlas raw data repository (PASS00321).

Contact: lars.malmstroem@imsb.biol.ethz.ch

Supplementary information: Supplementary data are available at Bioinformatics online.
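
To make the estimation methods named in the abstract more concrete, the following is a minimal Python sketch of three of them (TopN, iBAQ and NSAF) as they are commonly defined in the literature. It is not the aLFQ implementation and does not use the aLFQ API; the function names (`top_n`, `ibaq`, `nsaf`), their parameters and the toy values are hypothetical and chosen only for illustration. APEX and SCAMPI are omitted because they rely on trained or statistical models that do not reduce to a one-line formula.

```python
# Illustrative sketch only: hypothetical helper names, not part of aLFQ.

def top_n(peptide_intensities, n=3):
    """TopN: mean intensity of the n most intense peptides of a protein."""
    top = sorted(peptide_intensities, reverse=True)[:n]
    if not top:
        raise ValueError("at least one peptide intensity is required")
    return sum(top) / len(top)


def ibaq(peptide_intensities, n_theoretical_peptides):
    """iBAQ: summed peptide intensity divided by the number of
    theoretically observable peptides of the protein."""
    if n_theoretical_peptides <= 0:
        raise ValueError("n_theoretical_peptides must be positive")
    return sum(peptide_intensities) / n_theoretical_peptides


def nsaf(spectral_counts, lengths):
    """NSAF: spectral count per unit protein length, normalized so that
    the values of all proteins in the sample sum to 1."""
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}


if __name__ == "__main__":
    intensities = [10.0, 40.0, 20.0, 30.0]  # toy peptide intensities
    print(top_n(intensities, n=3))
    print(ibaq(intensities, n_theoretical_peptides=5))
    print(nsaf({"A": 10, "B": 30}, {"A": 100, "B": 100}))
```

Each function returns an unscaled abundance proxy. Converting such values into absolute quantities would still require one of the absolute normalization strategies mentioned above, for example calibration against proteins of known concentration.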
