Ancestral paralogs and pseudoparalogs and their role in the emergence of the eukaryotic cell

Gene duplication is a crucial mechanism of evolutionary innovation, and a substantial fraction of eukaryotic genomes consists of paralogous gene families. We assess the extent of ancestral paralogy, which dates back to the last common ancestor of all eukaryotes, and examine the origins of the ancestral paralogs and their potential roles in the emergence of eukaryotic cell complexity. A parsimonious reconstruction of ancestral gene repertoires shows that 4137 orthologous gene sets in the last eukaryotic common ancestor (LECA) map back to 2150 orthologous sets in the hypothetical first eukaryotic common ancestor (FECA), giving a paralogy quotient (PQ) of 1.92. Analogous reconstructions show significantly lower levels of paralogy in prokaryotes, with PQ values of 1.19 for archaea and 1.25 for bacteria. The only functional class of eukaryotic proteins with a significant excess of paralogous clusters over the mean includes molecular chaperones and proteins with related functions. Almost all genes in this category underwent multiple duplications during early eukaryotic evolution. In structural terms, the most prominent sets of paralogs are superstructure-forming proteins with repetitive domains, such as WD-40 and TPR. In addition to the true ancestral paralogs, which evolved via duplication at the onset of eukaryotic evolution, numerous pseudoparalogs were detected, i.e., homologous genes that apparently were acquired by early eukaryotes via different routes, including horizontal gene transfer (HGT) from diverse bacteria. The results of this study demonstrate that a major increase in the level of gene paralogy is a hallmark of the early evolution of eukaryotes.
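The arithmetic behind the eukaryotic paralogy quotient can be checked with a short sketch. It assumes that PQ is simply the number of orthologous sets in the descendant ancestor (LECA) divided by the number in the earlier ancestor (FECA), which is what the two counts quoted in the abstract suggest. The function name `paralogy_quotient` is hypothetical and is not taken from the study.

```python
def paralogy_quotient(descendant_sets: int, ancestral_sets: int) -> float:
    """Ratio of orthologous sets in the later ancestor to those in the earlier one.

    Assumed definition: the abstract reports only the two counts and the
    resulting quotient, not an explicit formula.
    """
    if ancestral_sets <= 0:
        raise ValueError("ancestral_sets must be a positive integer")
    return descendant_sets / ancestral_sets


# Counts quoted in the abstract: 4137 LECA sets mapping back to 2150 FECA sets.
pq_eukaryotes = paralogy_quotient(4137, 2150)
print(f"Eukaryotic PQ: {pq_eukaryotes:.2f}")  # Eukaryotic PQ: 1.92
```

Under this reading, a PQ of 1.92 means that each reconstructed FECA gene set corresponds, on average, to almost two sets in LECA, compared with the 1.19 and 1.25 reported for archaea and bacteria.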
