The Effect of Using an Inappropriate Protein Database for Proteomic Data Analysis

A recent study by Bromenshenk et al., published in PLoS One (2010), used proteomic analysis to identify peptides purportedly of Iridovirus and Nosema origin; however, the validity of this finding is controversial. We show here, through re-analysis of a subset of these data, that many of the spectra identified by Bromenshenk et al. as deriving from Iridovirus and Nosema proteins are actually derived from honey bee (Apis mellifera) proteins. We find no reliable evidence that proteins from Iridovirus or Nosema are present in the samples that were re-analyzed. This article is also intended as a learning exercise that illustrates some of the potential pitfalls in the analysis of mass spectrometry proteomic data, and it encourages authors to observe MS/MS data reporting guidelines that would make analysis problems easier to recognize during the review process.
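The database pitfall described above can be illustrated with a minimal sketch. It is not the search procedure used in either study: it matches on precursor mass alone, and the peptide sequences, the `HOST` and `VIRUS` databases, and the helper names `peptide_mass` and `best_match` are all hypothetical. The point is only that a search engine reports the best available candidate, so a spectrum from a host peptide is assigned to a pathogen peptide when the host proteome is missing from the database.

```python
# Toy illustration (hypothetical data): a search always returns the best
# candidate in the database it is given, even if the true source is absent.

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the residue sum


def peptide_mass(seq):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def best_match(observed_mass, database):
    """Return (source, peptide, absolute mass error) of the closest candidate.

    `database` maps a source name to a list of peptide sequences.
    No score threshold is applied, which is the pitfall being shown.
    """
    candidates = [
        (abs(peptide_mass(pep) - observed_mass), source, pep)
        for source, peptides in database.items()
        for pep in peptides
    ]
    error, source, pep = min(candidates)
    return source, pep, error


# Hypothetical peptide lists; not real honey bee or viral sequences.
HOST = {"host": ["ELVISK", "TAGDNR"]}
VIRUS = {"virus": ["ELVTSK", "GAVLK"]}

# The "observed" precursor mass comes from a host peptide.
observed = peptide_mass("ELVISK")

virus_only_hit = best_match(observed, VIRUS)
combined_hit = best_match(observed, {**HOST, **VIRUS})

print("virus-only database:", virus_only_hit)
print("host + virus database:", combined_hit)
```

With the pathogen-only database the host spectrum is still assigned to a viral peptide, with a large mass error that only a score or error threshold would catch. Once the host peptides are included, the correct assignment wins.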

[1] Alexey I. Nesvizhskii, et al. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search, 2002, Analytical Chemistry.

[2] Rabih E. Jabbour, et al. Iridovirus and Microsporidian Linked to Honey Bee Colony Decline, 2010, PLoS ONE.

[3] Leonard J. Foster. Interpretation of Data Underlying the Link Between Colony Collapse Disorder (CCD) and an Invertebrate Iridescent Virus, 2011, Molecular & Cellular Proteomics.

[4] Chris F. Taylor, et al. Proteomic data exchange and storage: the need for common standards and public repositories, 2007, Methods in Molecular Biology.

[5] R. Aebersold, et al. Mass spectrometry-based proteomics, 2003, Nature.

[6] J. Yates, et al. An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database, 1994, Journal of the American Society for Mass Spectrometry.

[7] James A. Hill, et al. Proteomics FASTA Archive and Reference Resource, 2008, Proteomics.

[8] Steven P. Gygi, et al. Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry, 2007, Nature Methods.

[9] A. Nesvizhskii. A survey of computational methods and error rate estimation procedures for peptide and protein identification in shotgun proteomics, 2010, Journal of Proteomics.