Elimination of Redundant Protein Identifications in High Throughput Proteomics

Tandem mass spectrometry followed by database search is the preferred method for protein identification in high-throughput proteomics. However, standard analysis methods give rise to highly redundant lists of proteins, with many proteins identified by the same sets of peptides. In essence, such a list contains every protein that might be present in the sample. Here we present an algorithm that eliminates this redundancy and determines the minimum number of proteins needed to explain the observed peptides. We demonstrate that applying the algorithm yields a significantly smaller set of proteins and greatly reduces the number of "shared" peptides.
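The abstract does not spell out the algorithm itself. Finding the smallest set of proteins that explains all observed peptides can be framed as a set-cover problem, and the sketch below illustrates that framing with a standard greedy approximation. It is an assumption-laden illustration, not the authors' method: the function name `minimal_protein_list`, the input layout (a mapping from protein accession to its set of identified peptides), and the toy accessions are all hypothetical, and a greedy cover is not guaranteed to be the true minimum.

```python
from collections import defaultdict


def minimal_protein_list(protein_to_peptides):
    """Greedy set-cover sketch (hypothetical helper, not the paper's algorithm).

    protein_to_peptides: dict mapping a protein accession to the set of
    peptides that identified it. Returns a list of protein groups, each a
    sorted list of accessions that share one identical peptide set.
    """
    # Proteins identified by exactly the same peptides cannot be told apart,
    # so collapse them into a single group keyed by that peptide set.
    groups = defaultdict(list)
    for protein, peptides in protein_to_peptides.items():
        groups[frozenset(peptides)].append(protein)

    # Fixed candidate order makes tie-breaking deterministic.
    candidates = sorted(groups, key=lambda peps: sorted(groups[peps]))

    uncovered = set().union(*groups.keys())
    selected = []
    while uncovered:
        # Pick the group that explains the most still-unexplained peptides;
        # max() returns the first candidate on ties.
        best = max(candidates, key=lambda peps: len(peps & uncovered))
        selected.append(sorted(groups[best]))
        uncovered -= best
    return selected


# Toy input: P1 and P2 are indistinguishable, P3 is explained by a subset of
# their peptides, and peptide "d" is shared between P4 and P5.
hits = {
    "P1": {"a", "b", "c"},
    "P2": {"a", "b", "c"},
    "P3": {"a", "b"},
    "P4": {"d"},
    "P5": {"c", "d"},
}
result = minimal_protein_list(hits)
print(result)
```

In this toy input, five candidate proteins collapse to two groups that together explain all four peptides. Note that the choice between P4 and P5 for peptide "d" is settled only by the tie-breaking order, which is one reason a real implementation would likely need additional evidence or scoring to resolve such cases.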
