PAnalyzer: A software tool for protein inference in shotgun proteomics

Background

Protein inference from peptide identifications in shotgun proteomics must deal with ambiguities that arise from peptides shared between different proteins, which are common in higher eukaryotes. Recently, data-independent acquisition (DIA) approaches have emerged as an alternative to traditional data-dependent acquisition (DDA) in shotgun proteomics experiments. MSE is the name of one of the DIA approaches used in QTOF instruments. MSE data require specialized software to process the acquired spectra and to perform peptide and protein identifications. However, the software currently available does not group the identified proteins in a transparent way that takes peptide evidence categories into account. Furthermore, inspecting, comparing and reporting the results requires tedious manual intervention. Here we report a software tool that addresses these limitations for MSE data.

Results

In this paper we present PAnalyzer, a software tool focused on the protein inference process of shotgun proteomics. Our approach considers all the identified proteins and groups them when necessary, indicating their confidence using different evidence categories. PAnalyzer can read protein identification files in the XML output format of the ProteinLynx Global Server (PLGS) software provided by Waters Corporation for their MSE data, and also in the mzIdentML format recently standardized by HUPO-PSI. Multiple files can be read simultaneously, in which case they are treated as technical replicates. Results are saved to CSV, HTML and mzIdentML files (the last only when the input is a single mzIdentML file). An MSE analysis of a real sample is presented to compare the results of PAnalyzer and ProteinLynx Global Server.

Conclusions

We present a software tool to deal with the ambiguities that arise in the protein inference process. Key contributions are support for MSE data analyzed by ProteinLynx Global Server and the integration of technical replicates. PAnalyzer is an easy-to-use, multiplatform, free software tool.
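
To make the idea of grouping proteins by peptide evidence more concrete, the following is a minimal Python sketch of one possible classification scheme. It is not PAnalyzer's implementation: the function name `classify_proteins`, the three labels ("conclusive", "indistinguishable", "ambiguous") and the toy data are assumptions chosen for illustration, and the abstract does not specify the actual categories or rules.

```python
from collections import defaultdict


def classify_proteins(protein_peptides):
    """Assign an illustrative evidence label to each protein.

    protein_peptides maps a protein accession to the set of peptides
    identified for it. The labels are hypothetical, not PAnalyzer's own.
    """
    # Invert the mapping: peptide -> proteins that contain it.
    peptide_proteins = defaultdict(set)
    for protein, peptides in protein_peptides.items():
        for peptide in peptides:
            peptide_proteins[peptide].add(protein)

    # Collect proteins that are supported by exactly the same peptide set.
    same_evidence = defaultdict(set)
    for protein, peptides in protein_peptides.items():
        same_evidence[frozenset(peptides)].add(protein)

    labels = {}
    for protein, peptides in protein_peptides.items():
        # A peptide is "unique" if it maps to a single protein.
        has_unique = any(len(peptide_proteins[p]) == 1 for p in peptides)
        if has_unique:
            labels[protein] = "conclusive"
        elif len(same_evidence[frozenset(peptides)]) > 1:
            labels[protein] = "indistinguishable"
        else:
            labels[protein] = "ambiguous"
    return labels


# Toy data (invented): four proteins and their identified peptides.
example = {
    "P1": {"AAK", "CCR"},  # CCR maps only to P1
    "P2": {"DDK", "EER"},  # same peptide set as P3
    "P3": {"DDK", "EER"},
    "P4": {"AAK", "DDK"},  # only shared peptides, no identical partner
}
labels = classify_proteins(example)
```

In this toy example, a protein with at least one unique peptide is labeled conclusive, proteins with identical peptide sets are grouped as indistinguishable, and the remainder are marked ambiguous. A real tool would likely apply finer rules, for instance when combining technical replicates.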
