Archaeal “Dark Matter” and the Origin of Eukaryotes

Current hypotheses about the history of cellular life are mainly based on analyses of cultivated organisms, but these represent only a small fraction of extant biodiversity. The sequencing of new environmental lineages therefore provides an opportunity to test, revise, or reject existing ideas about the tree of life and the origin of eukaryotes. According to the textbook three domains hypothesis, the eukaryotes emerge as the sister group to a monophyletic Archaea. However, recent analyses incorporating better phylogenetic models and an improved sampling of the archaeal domain have generally supported the competing eocyte hypothesis, in which core genes of eukaryotic cells originated from within the Archaea, with important implications for eukaryogenesis. Given this trend, it was surprising that a recent analysis incorporating new genomes from uncultivated Archaea recovered a strongly supported three domains tree. Here, we show that this result was due in part to the use of a poorly fitting phylogenetic model and also to the inclusion by an automated pipeline of genes of putative bacterial origin rather than nucleocytosolic versions for some of the eukaryotes analyzed. When these issues were resolved, analyses including the new archaeal lineages placed core eukaryotic genes within the Archaea. These results are consistent with a number of recent studies in which improved archaeal sampling and better phylogenetic models agree in supporting the eocyte tree over the three domains hypothesis.
