An evaluation for cross-species proteomics research by publicly available expressed sequence tag database search using tandem mass spectral data.

Using 1383 tandem mass spectra derived from 120 individual protein spots, separated by two-dimensional (2-D) gel electrophoresis of protein samples from three different species, we performed comparative analyses by searching the Expressed Sequence Tag (EST) database (DB) and the NCBI non-redundant (nr) DB of green plants, in each case with the Mascot search engine, to establish a statistical basis. The EST DB search identified more peptides that were manually validated by de novo sequencing (DNS), and assigned them to fewer, more closely related species, than the nr DB search; the difference was statistically significant. Our data demonstrated that correct peptide identifications could be given low Mascot scores (e.g. 6-14) and incorrect peptide identifications could be given high Mascot scores (e.g. 68-83). Our data also showed that the current evaluation approaches to protein assignments are unsatisfactory, because manual validation recognized a few 'false-positive' proteins and rescued several 'false-negative' proteins.
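
To put the quoted score ranges in perspective, the following minimal sketch converts between a Mascot ions score and the probability it encodes, using the commonly documented relation score = -10·log10(P), where P is the probability that the match is a random event. The function names are hypothetical, and the abstract does not state the significance threshold used in the study (in Mascot it depends on the number of candidate peptides in the searched database), so no threshold is assumed here.

```python
import math


def mascot_score_to_probability(score: float) -> float:
    """Return P, the chance that a match is random, for a Mascot ions score.

    Assumes the standard relation score = -10 * log10(P).
    """
    return 10 ** (-score / 10)


def probability_to_mascot_score(p: float) -> float:
    """Return the Mascot ions score corresponding to a random-match probability P."""
    if not 0 < p <= 1:
        raise ValueError("p must be in the interval (0, 1]")
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Score ranges quoted in the abstract: low (6-14) and high (68-83).
    for score in (6, 14, 68, 83):
        print(f"score {score:>2}: P = {mascot_score_to_probability(score):.3g}")
```

Under this relation, a score of 6 to 14 corresponds to a random-match probability of roughly 0.25 to 0.04, while a score of 68 to 83 corresponds to a probability below about 2 × 10⁻⁷, which is why correct low-scoring and incorrect high-scoring matches are notable.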