Whole proteome analysis of post-translational modifications: applications of mass-spectrometry for proteogenomic annotation.

While bacterial genome annotations have improved significantly in recent years, techniques for bacterial proteome annotation (including post-translational chemical modifications, signal peptides, proteolytic events, etc.) are still in their infancy. At the same time, the number of sequenced bacterial genomes is rising sharply, far outpacing our ability to validate the predicted genes, let alone annotate bacterial proteomes. In this study, we use tandem mass spectrometry (MS/MS) to annotate the proteome of Shewanella oneidensis MR-1, an important microbe for bioremediation. In particular, we provide the first comprehensive map of post-translational modifications in a bacterium, including a large number of chemical modifications, signal peptide cleavages, and cleavages of N-terminal methionine residues. We also detect multiple genes that gene prediction programs either missed or assigned incorrect start positions, and we suggest corrections to improve the gene annotation. This study demonstrates that complementing every genome sequencing project with an MS/MS project would significantly improve both genome and proteome annotations at a reasonable cost.
