Global analysis of predicted proteomes: functional adaptation of physical properties.

The physical characteristics of proteins are fundamentally important in organismal function. We used the complete predicted proteomes of >100 organisms spanning the three domains of life to investigate the comparative biology and evolution of proteomes. Theoretical 2D gels were constructed with axes of protein mass and charge (pI) and converted to density estimates comparable across all types and sizes of proteome. We asked whether we could detect general patterns of proteome conservation and variation. The overall pattern of theoretical 2D gels was strongly conserved across all life forms. Nevertheless, coevolved replicons from the same organism (different chromosomes, or plasmid and host chromosomes) encode proteomes more similar to each other than to those from different organisms. Furthermore, membrane and nonmembrane subproteomes differed both within organisms (membrane proteins are on average more basic and heavier) and in their variation across organisms, suggesting that membrane proteomes evolve most rapidly. Experimentally, a significant positive relationship, independent of phylogeny, was found between the predicted proteome and the Biolog profile, a measure associated with the ecological niche. Finally, we show that, for the smallest and most alkaline proteomes, there is a negative relationship between proteome size and basicity. This relationship is not adequately explained by AT bias at the DNA sequence level. Together, these data provide evidence of functional adaptation in the properties of complete proteomes.
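
To make the idea of a theoretical 2D gel concrete, the following is a minimal sketch, not the authors' pipeline. It estimates each protein's mass and pI from its sequence and bins the proteome onto a mass-by-pI grid normalized to sum to 1, so that proteomes of different sizes can be compared. The pKa set, the linear mass axis, the bin counts, and the use of simple histogram binning in place of whatever density estimator the study used are all assumptions made for illustration. The function names (`mass`, `net_charge`, `isoelectric_point`, `gel_density`) are hypothetical.

```python
from collections import Counter

# Average residue masses in daltons; one water is added per chain.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.015

# Illustrative pKa values (assumption: the source does not say which set was used).
PKA_POS = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEG = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
PKA_NTERM, PKA_CTERM = 9.0, 2.0


def mass(seq):
    """Approximate average molecular mass of an unmodified chain, in daltons."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def net_charge(seq, ph):
    """Net charge at a given pH from the Henderson-Hasselbalch relation."""
    counts = Counter(seq)
    pos = 1.0 / (1.0 + 10 ** (ph - PKA_NTERM))
    neg = 1.0 / (1.0 + 10 ** (PKA_CTERM - ph))
    for aa, pka in PKA_POS.items():
        pos += counts[aa] / (1.0 + 10 ** (ph - pka))
    for aa, pka in PKA_NEG.items():
        neg += counts[aa] / (1.0 + 10 ** (pka - ph))
    return pos - neg


def isoelectric_point(seq, lo=0.0, hi=14.0, tol=1e-4):
    """Bisection on pH: net charge decreases monotonically as pH rises."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def gel_density(seqs, pi_bins=14, mass_bins=10, max_mass=200000.0):
    """Fraction of proteins per (mass, pI) cell; rows are mass, columns are pI."""
    if not seqs:
        raise ValueError("need at least one sequence")
    grid = [[0.0] * pi_bins for _ in range(mass_bins)]
    for seq in seqs:
        col = min(int(isoelectric_point(seq) / 14.0 * pi_bins), pi_bins - 1)
        row = min(int(mass(seq) / max_mass * mass_bins), mass_bins - 1)
        grid[row][col] += 1.0
    return [[cell / len(seqs) for cell in row] for row in grid]
```

Because each grid sums to 1 regardless of how many proteins it contains, two proteomes (or a membrane and a nonmembrane subproteome) can be compared cell by cell. Proteins heavier than `max_mass` are placed in the top mass bin rather than dropped.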
