When target-decoy false discovery rate estimations are inaccurate and how to spot instances.

Target-decoy database searching has become the de facto approach for estimating a false discovery rate, and hence the reliability, of proteomic search engine results from mass spectrometry fragmentation data. Several articles have examined how the method of creating the decoy database, the search engine scoring, or the search parameters affect whether this approach provides an accurate estimate, and their conclusions do not all agree. Hence, there may be some confusion about how effective this approach is and how broadly it can be applied. Although it is generally very effective, in this article I emphasize some of the pitfalls and dangers of using the target-decoy approach and indicate tell-tale signs that something may be amiss. This information should help researchers become more astute in their assessment of search results.
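For readers unfamiliar with the calculation, the following is a minimal sketch of a target-decoy estimate, not the procedure of any particular search engine. It assumes a concatenated target-decoy search, that higher scores are better, and the common convention of dividing the number of decoy matches by the number of target matches above a score threshold. The function name `estimate_fdr` and the `(score, is_decoy)` tuple layout are hypothetical and chosen only for illustration.

```python
def estimate_fdr(psms, threshold):
    """Estimate FDR at a score threshold from target-decoy search results.

    psms: iterable of (score, is_decoy) pairs (hypothetical layout).
    Assumes higher scores are better and uses decoys / targets,
    one common convention for a concatenated target-decoy search.
    """
    targets = 0
    decoys = 0
    for score, is_decoy in psms:
        if score < threshold:
            continue  # match does not pass the threshold
        if is_decoy:
            decoys += 1
        else:
            targets += 1
    if targets == 0:
        return 0.0  # no accepted target matches, so nothing to estimate
    return min(1.0, decoys / targets)


# Made-up scores for illustration only.
example_psms = [
    (95.0, False), (90.0, False), (85.0, True), (80.0, False),
    (70.0, False), (60.0, True), (50.0, False),
]
fdr_at_65 = estimate_fdr(example_psms, 65.0)  # 1 decoy / 4 targets
```

With so few matches, a single decoy moves the estimate by a large step, which is one reason estimates from small result sets should be treated with caution.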
