Comparative Functional Analysis of the Caenorhabditis elegans and Drosophila melanogaster Proteomes

The nematode Caenorhabditis elegans is a popular model system in genetics, not least because a majority of human disease genes are conserved in C. elegans. To generate a comprehensive inventory of its expressed proteome, we performed extensive shotgun proteomics and identified more than half of all predicted C. elegans proteins. This allowed us to confirm and extend genome annotations, characterize the role of operons in C. elegans, and semiquantitatively infer abundance levels for thousands of proteins. Furthermore, for the first time to our knowledge, we were able to compare two animal proteomes (C. elegans and Drosophila melanogaster). We found that the abundances of orthologous proteins in metazoans correlate remarkably well, better than protein abundance versus transcript abundance within each organism or transcript abundances across organisms; this suggests that changes in transcript abundance may have been partially offset during evolution by opposing changes in protein abundance.
