Informed-Proteomics: Open Source Software Package for Top-down Proteomics

Top-down proteomics, the analysis of intact proteins in their endogenous form, preserves valuable information about post-translational modifications, isoforms, and proteolytic processing. The quality of top-down liquid chromatography–tandem mass spectrometry (LC-MS/MS) data sets is rapidly increasing owing to advances in instrumentation and sample-processing protocols. However, top-down mass spectra are substantially more complex than conventional bottom-up data, so new algorithms and software tools are needed for confident proteoform identification and quantification. Here we present Informed-Proteomics, an open-source software suite for top-down proteomics analysis that consists of an LC-MS feature-finding algorithm, a database search algorithm, and an interactive results viewer. We compare our tool with several other popular tools using human-in-mouse xenograft luminal and basal breast tumor samples that are known, based on bottom-up analysis, to differ significantly in protein abundance.
