Mass spectrometry of the M. smegmatis proteome: protein expression levels correlate with function, operons, and codon bias.

The fast-growing bacterium Mycobacterium smegmatis is a model mycobacterial system: a nonpathogenic soil bacterium that nonetheless shares many features with the pathogenic Mycobacterium tuberculosis, the causative agent of tuberculosis. The study of M. smegmatis is expected to shed light on mechanisms of mycobacterial growth and complex lipid metabolism, and it provides a tractable system for antimycobacterial drug development. Although the M. smegmatis genome sequence is not yet complete, we used multidimensional chromatography and tandem mass spectrometry, in combination with the partially completed genome sequence, to detect and identify a total of 901 distinct proteins from M. smegmatis across 25 growth conditions, providing experimental annotation for many predicted genes with an approximately 5% false-positive identification rate. We observed numerous proteins involved in energy production (9.8% of expressed proteins), protein translation (8.7%), and lipid biosynthesis (5.4%); 33% of the 901 proteins are of unknown function. Protein expression levels were estimated from the number of observations of each protein, allowing measurement of the differential expression of complete operons and comparison of the stationary and exponential phase proteomes. Expression levels are correlated with proteins' codon biases and with mRNA expression levels, as measured by comparison with codon adaptation indices, principal component analysis of codon frequencies, and DNA microarray data. This observation is consistent with the notion that either (1) prokaryotic protein expression levels are largely preset by codon choice, or (2) codon choice is optimized for consistency with average expression levels, regardless of the mechanism of regulating expression.
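To make the codon adaptation index mentioned above concrete, the following is a minimal Python sketch of the index in its commonly described form: each codon gets a relative adaptiveness weight (its count in a reference set of highly expressed genes divided by the count of the most frequent synonymous codon), and a gene's index is the geometric mean of the weights of its codons. The function names, the toy reference sequences, and the pseudocount of 0.5 for unobserved codons are illustrative assumptions, not details taken from this study.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split a coding sequence into complete codons (trailing bases dropped)."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def relative_adaptiveness(reference_seqs, pseudocount=0.5):
    """Weight per codon: its count divided by the count of the most-used synonym.

    reference_seqs is a hypothetical set of highly expressed genes.
    The pseudocount for unobserved codons is an assumption of this sketch.
    Stop codons and single-codon amino acids (Met, Trp) are left out.
    """
    counts = Counter(
        c for s in reference_seqs for c in codons(s) if c in CODON_TO_AA
    )
    weights = {}
    for aa in set(AMINO):
        family = [c for c, a in CODON_TO_AA.items() if a == aa]
        if aa == "*" or len(family) == 1:
            continue
        fam_counts = {c: counts[c] or pseudocount for c in family}
        top = max(fam_counts.values())
        for c in family:
            weights[c] = fam_counts[c] / top
    return weights


def cai(seq, weights):
    """Geometric mean of the weights of the scored codons in seq."""
    logs = [math.log(weights[c]) for c in codons(seq) if c in weights]
    if not logs:
        raise ValueError("sequence contains no scorable codons")
    return math.exp(sum(logs) / len(logs))


# Toy usage: a reference set that prefers GCT over GCC for alanine.
toy_weights = relative_adaptiveness(["GCTGCTGCTGCC"])
preferred = cai("GCTGCT", toy_weights)   # uses only the preferred codon
mixed = cai("GCTGCC", toy_weights)       # mixes in a less-used codon
```

In this toy case the gene built only from the preferred codon scores higher than the mixed one, which is the kind of bias that would then be compared against observation-count estimates of protein abundance.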
