Computational analysis of shotgun proteomics data.

Proteomics technology is progressing at an incredible rate. The latest generation of tandem mass spectrometers can now acquire tens of thousands of fragmentation spectra in a matter of hours. Furthermore, quantitative proteomics methods have been developed that incorporate a stable isotope-labeled internal standard for every peptide within a complex protein mixture for the measurement of relative protein abundances. These developments have opened the doors for 'shotgun' proteomics, yet have also placed a burden on the computational approaches that manage the data. With each new method that is developed, the quantity of data that can be derived from a single experiment increases. To deal with this increase, new computational approaches are being developed to manage the data and assess false positives. This review discusses current approaches for analyzing proteomics data by mass spectrometry and identifies present computational limitations and bottlenecks.

[1]  P. A. Payne,et al.  Approaches to the improvement of quantitative precision in selected ion monitoring: High resolution applications , 1984 .

[2]  J. Yates,et al.  A hypergeometric probability model for protein identification and validation using tandem mass spectral data and protein sequence databases. , 2003, Analytical chemistry.

[3]  William Stafford Noble,et al.  A new algorithm for the evaluation of shotgun peptide sequencing in proteomics: support vector machine classification of peptide MS/MS spectra and SEQUEST scores. , 2003, Journal of proteome research.

[4]  Alexey I Nesvizhskii,et al.  Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. , 2002, Analytical chemistry.

[5]  R. Aebersold,et al.  Automated statistical analysis of protein abundance ratios from data generated by stable-isotope dilution and tandem mass spectrometry. , 2003, Analytical chemistry.

[6]  D. N. Perkins,et al.  Probability‐based protein identification by searching sequence databases using mass spectrometry data , 1999, Electrophoresis.

[7]  J. Yates,et al.  Organellar proteomics reveals Golgi arginine dimethylation. , 2004, Molecular biology of the cell.

[8]  J. Yates,et al.  An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database , 1994, Journal of the American Society for Mass Spectrometry.

[9]  Steven P Gygi,et al.  Intensity-based protein identification by machine learning from a library of tandem mass spectra , 2004, Nature Biotechnology.

[10]  J. Yates,et al.  Large-scale analysis of the yeast proteome by multidimensional protein identification technology , 2001, Nature Biotechnology.

[11]  J. Yates,et al.  Probability-based validation of protein identifications using a modified SEQUEST algorithm. , 2002, Analytical chemistry.

[12]  J. Yates,et al.  A correlation algorithm for the automated quantitative analysis of shotgun proteomics data. , 2003, Analytical chemistry.

[13]  The statistical significance of protein identification results as a function of the number of protein sequences searched. , 2004, Journal of proteome research.

[14]  J. Yates,et al.  Method to compare collision-induced dissociation spectra of peptides: potential for library searching and subtractive analysis. , 1998, Analytical chemistry.

[15]  K. Resing,et al.  Improving reproducibility and sensitivity in identifying human proteins by shotgun proteomics. , 2004, Analytical chemistry.

[16]  P. Højrup,et al.  Rapid identification of proteins by peptide-mass fingerprinting , 1993, Current Biology.

[17]  J. Yates,et al.  Statistical models for protein validation using tandem mass spectral data and protein amino acid sequence databases. , 2004, Analytical chemistry.

[18]  Roger E. Moore,et al.  Qscore: An algorithm for evaluating SEQUEST database search results , 2002, Journal of the American Society for Mass Spectrometry.

[19]  R. Aebersold,et al.  ProbID: A probabilistic algorithm to identify peptides through sequence database searching using tandem mass spectral data , 2002, Proteomics.

[20]  D. Creasy,et al.  Error tolerant searching of uninterpreted tandem mass spectrometry data , 2002, Proteomics.

[21]  J. Yates,et al.  A method for the comprehensive proteomic analysis of membrane proteins , 2003, Nature Biotechnology.

[22]  Jacob D. Jaffe,et al.  Proteogenomic mapping as a complementary method to perform genome annotation , 2004, Proteomics.

[23]  R. Aebersold,et al.  Quantitative profiling of differentiation-induced microsomal proteins using isotope-coded affinity tags and mass spectrometry , 2001, Nature Biotechnology.

[24]  J. Yates,et al.  DTASelect and Contrast: tools for assembling and comparing protein identifications from shotgun proteomics. , 2002, Journal of proteome research.