The primary divisions of life: a phylogenomic approach employing composition-heterogeneous methods

The three-domains tree, which depicts eukaryotes and archaebacteria as monophyletic sister groups, is the dominant model for early eukaryotic evolution. By contrast, the ‘eocyte hypothesis’, in which eukaryotes are proposed to have originated from within the archaebacteria as sister to the Crenarchaeota (also called the eocytes), has been largely neglected in the literature. We have investigated support for these two competing hypotheses from molecular sequence data using methods that attempt to accommodate the across-site compositional heterogeneity and the across-tree compositional and rate matrix heterogeneity that are manifest features of these data. When ribosomal RNA genes were analysed using standard methods that do not adequately model these kinds of heterogeneity, the three-domains tree was supported. However, this support was eroded or lost when composition-heterogeneous models were used, with a concomitant increase in support for the eocyte tree for eukaryotic origins. Analysis of combined amino acid sequences from 41 protein-coding genes supported the eocyte tree, whether or not composition-heterogeneous models were used. The possible effects of substitutional saturation in our data were examined using simulation; these results suggested that saturation is delayed by among-site rate variation in the sequences, and that phylogenetic signal for ancient relationships is plausibly present in these data.
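
To make "across-tree compositional heterogeneity" concrete, the sketch below computes a simple chi-square statistic on a taxa-by-state table of character counts. It is an illustration only, not the method used in this study: the function name `composition_chi2` and the toy alignments are hypothetical, and the statistic treats sites as independent and ignores shared ancestry, so it serves at best as a rough screen for taxa whose composition departs from the rest.

```python
from collections import Counter


def composition_chi2(alignment, alphabet="ACGT"):
    """Chi-square statistic for homogeneity of composition across taxa.

    `alignment` maps a taxon name to its aligned sequence (hypothetical
    input format). Returns (chi2, degrees_of_freedom).
    """
    # Count each state per taxon, ignoring gaps and ambiguity codes.
    counts = {
        name: Counter(ch for ch in seq.upper() if ch in alphabet)
        for name, seq in alignment.items()
    }
    taxon_totals = {name: sum(c.values()) for name, c in counts.items()}
    grand_total = sum(taxon_totals.values())
    state_totals = {s: sum(c[s] for c in counts.values()) for s in alphabet}

    chi2 = 0.0
    for name, c in counts.items():
        for s in alphabet:
            # Expected count if every taxon shared the pooled composition.
            expected = taxon_totals[name] * state_totals[s] / grand_total
            if expected > 0:
                chi2 += (c[s] - expected) ** 2 / expected

    dof = (len(counts) - 1) * (len(alphabet) - 1)
    return chi2, dof


# Toy data (made up): identical composition versus completely different composition.
print(composition_chi2({"taxon1": "ACGT", "taxon2": "ACGT"}))
print(composition_chi2({"taxon1": "AAAA", "taxon2": "CCCC"}))
```

A large statistic relative to its degrees of freedom suggests that a single stationary composition is a poor description of the data, which is the situation composition-heterogeneous models are meant to address.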
