The balance of driving forces during genome evolution in prokaryotes.

Genomes are shaped by evolutionary processes such as gene genesis, horizontal gene transfer (HGT), and gene loss. To quantify the relative contributions of these processes, we analyze the distribution of 12,762 protein families on a phylogenetic tree derived from the entire genomes of 41 Bacteria and 10 Archaea. We show that gene loss is the most important factor in shaping genome content, being up to three times more frequent than HGT. Gene genesis ranks second and may contribute up to twice as many genes as HGT. We suggest that gene gain and gene loss in prokaryotes are balanced, so that prokaryotic genome size is, on average, kept constant. Despite the importance of HGT, our results indicate that the majority of protein families have been transmitted only by vertical inheritance. To test our method, we present a study of strain-specific genes of Helicobacter pylori and demonstrate correct predictions of gene loss and HGT for at least 81% of validated cases. This approach indicates that it is possible to trace the history of genome content and to quantify the factors that shape contemporary prokaryotic genomes.
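To illustrate what it can mean to map a protein family's presence and absence onto a tree and count gain and loss events, the following is a minimal sketch using generic weighted (Sankoff-style) parsimony. It is not the method used in this study: the toy tree, the function name `reconstruct`, and the `gain_cost` and `loss_cost` values are all assumptions made for illustration. The first gain can be read as the origin of the family (genesis), and any additional gains as candidate HGT events.

```python
from math import inf

def reconstruct(tree, present, gain_cost=3.0, loss_cost=1.0):
    """Count (gains, losses) in the cheapest scenario for one protein family.

    tree      -- nested tuples with leaf names as strings (hypothetical toy format)
    present   -- set of leaf names whose genome contains the family
    gain_cost -- assumed penalty for each appearance of the family
    loss_cost -- assumed penalty for each loss of the family
    """
    def score(node):
        # Returns two (cost, gains, losses) tuples: node lacks / has the family.
        if isinstance(node, str):
            has = node in present
            impossible, free = (inf, 0, 0), (0.0, 0, 0)
            return (impossible, free) if has else (free, impossible)
        absent, there = (0.0, 0, 0), (0.0, 0, 0)
        for child in node:
            c0, c1 = score(child)
            # Parent lacks the family: child also lacks it, or gains it on the branch.
            b0 = min(c0, (c1[0] + gain_cost, c1[1] + 1, c1[2]))
            # Parent has the family: child keeps it, or loses it on the branch.
            b1 = min(c1, (c0[0] + loss_cost, c0[1], c0[2] + 1))
            absent = tuple(a + b for a, b in zip(absent, b0))
            there = tuple(a + b for a, b in zip(there, b1))
        return absent, there

    root0, root1 = score(tree)
    # Presence at the root is charged as one gain (the origin of the family).
    root1 = (root1[0] + gain_cost, root1[1] + 1, root1[2])
    best = min(root0, root1)  # ties on cost are broken in favor of fewer gains
    return best[1], best[2]

# Toy five-genome tree; leaf names are placeholders, not real taxa.
toy_tree = ((("A", "B"), "C"), ("D", "E"))
gains, losses = reconstruct(toy_tree, {"A", "B", "D"})
print(gains, losses)
```

With a high gain penalty the sketch explains a patchy distribution by one ancient origin followed by losses. Lowering `gain_cost` relative to `loss_cost` makes independent gains, interpretable as HGT, the cheaper explanation, so the inferred balance between loss and transfer depends directly on these assumed penalties.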
