Stability along with Extreme Variability in Core Genome Evolution

The shape of the distribution of evolutionary distances between orthologous genes in pairs of closely related genomes is universal throughout the entire range of cellular life forms. The near invariance of this distribution across billions of years of evolution can be accounted for by the Universal Pacemaker (UPM) model of genome evolution, which yields a significantly better fit to the phylogenetic data than the Molecular Clock (MC) model. Unlike the MC model, the UPM model does not assume constant gene-specific evolutionary rates. Instead, it postulates that, in each evolving lineage, the evolutionary rates of all genes change approximately in unison, although the pacemakers of different lineages are not necessarily synchronized. Here, we dissect the nearly constant evolutionary rate distribution by comparing the genome-wide relative rates of evolution of individual genes in pairs or triplets of closely related genomes from diverse bacterial and archaeal taxa. We show that the gene-specific relative rate is an important feature of genome evolution that explains more than half of the variance of the evolutionary distances, yet the ranges of relative rate variability are extremely broad, even for universal genes. Because of this high variance, the gene-specific rate is a poor predictor of the conservation rank for any gene in any particular lineage.
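The two observations above (a gene-specific rate that explains much of the variance, alongside wide variation in a gene's rank between lineages) can coexist, as a toy simulation may help illustrate. The sketch below is not the authors' method or data. It assumes a hypothetical log-additive model in which the log distance of gene g in lineage l is a gene-specific term plus a lineage-wide pacemaker term plus independent noise. All function names and parameter values are invented for illustration.

```python
import random
import statistics


def simulate_log_distances(n_genes=200, n_lineages=12, gene_sd=1.0,
                           pacemaker_sd=0.5, noise_sd=0.8, seed=0):
    """Toy model (an assumption, not the paper's): log d[g][l] = gene + pacemaker + noise."""
    rng = random.Random(seed)
    gene_effect = [rng.gauss(0.0, gene_sd) for _ in range(n_genes)]
    pacemaker = [rng.gauss(0.0, pacemaker_sd) for _ in range(n_lineages)]
    return [[gene_effect[g] + pacemaker[l] + rng.gauss(0.0, noise_sd)
             for l in range(n_lineages)]
            for g in range(n_genes)]


def gene_variance_fraction(log_d):
    """Share of the variance of lineage-centred log distances explained by gene means."""
    n_lineages = len(log_d[0])
    # Remove the lineage-wide (pacemaker) shift by subtracting each lineage's mean.
    col_means = [statistics.fmean(row[l] for row in log_d)
                 for l in range(n_lineages)]
    centred = [[row[l] - col_means[l] for l in range(n_lineages)]
               for row in log_d]
    # After centring, the grand mean is zero, so sums of squares need no offset.
    gene_means = [statistics.fmean(row) for row in centred]
    total = sum(x * x for row in centred for x in row)
    between_genes = n_lineages * sum(m * m for m in gene_means)
    return between_genes / total


def rank_ranges(log_d):
    """For each gene, the spread (max - min) of its distance rank across lineages."""
    n_genes, n_lineages = len(log_d), len(log_d[0])
    ranks = [[0] * n_lineages for _ in range(n_genes)]
    for l in range(n_lineages):
        order = sorted(range(n_genes), key=lambda g: log_d[g][l])
        for rank, g in enumerate(order):
            ranks[g][l] = rank
    return [max(r) - min(r) for r in ranks]


if __name__ == "__main__":
    log_d = simulate_log_distances()
    print("fraction of variance explained by gene:", gene_variance_fraction(log_d))
    print("median rank range across lineages:", statistics.median(rank_ranges(log_d)))
```

Under these assumed parameters, the gene term accounts for a majority of the variance of the lineage-centred log distances, while the rank of an individual gene can still shift considerably from one lineage to another. Changing `noise_sd` relative to `gene_sd` shows how quickly rank predictability degrades.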
