Integrative modeling of gene and genome evolution roots the archaeal tree of life

Significance

The Archaea represent a primary domain of cellular life, play major roles in modern-day biogeochemical cycles, and are central to debates about the origin of eukaryotic cells. However, understanding their origins and evolutionary history is challenging because of the immense time spans involved. Here we apply a new approach that harnesses the information in patterns of gene family evolution to find the root of the archaeal tree and to resolve the metabolism of the earliest archaeal cells. Our approach robustly distinguishes between published rooting hypotheses, suggests that the first Archaea were anaerobes that may have fixed carbon via the Wood–Ljungdahl pathway, and quantifies the cumulative impact of horizontal transfer on archaeal genome evolution.

A root for the archaeal tree is essential for reconstructing the metabolism and ecology of early cells and for testing hypotheses that propose that the eukaryotic nuclear lineage originated from within the Archaea; however, published studies based on outgroup rooting disagree regarding the position of the archaeal root. Here we constructed a consensus unrooted archaeal topology using protein concatenation and a multigene supertree method based on 3,242 single gene trees, and then rooted this tree using a recently developed model of genome evolution. This model uses evidence from gene duplications, horizontal transfers, and gene losses contained in 31,236 archaeal gene families to identify the most likely root for the tree. Our analyses support the monophyly of DPANN (Diapherotrites, Parvarchaeota, Aenigmarchaeota, Nanoarchaeota, Nanohaloarchaea), a recently discovered cosmopolitan and genetically diverse lineage, and, in contrast to previous work, place the tree root between DPANN and all other Archaea. The sister group to DPANN comprises the Euryarchaeota and the TACK Archaea, including Lokiarchaeum, which our analyses suggest are monophyletic sister lineages. Metabolic reconstructions on the rooted tree suggest that early Archaea were anaerobes that may have had the ability to reduce CO2 to acetate via the Wood–Ljungdahl pathway. In contrast to proposals suggesting that genome reduction has been the predominant mode of archaeal evolution, our analyses infer a relatively small-genomed archaeal ancestor that subsequently increased in complexity via gene duplication and horizontal gene transfer.
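The idea of choosing a root from gene family evidence can be illustrated with a minimal sketch. It assumes that each gene family has already been scored with a log-likelihood under every candidate root, and that families are treated as independent so their log-likelihoods add. The function name `best_root`, the labels `root_A` and `root_B`, and all numbers are hypothetical and purely illustrative; they are not the values or the software used in the study.

```python
from collections import defaultdict


def best_root(family_loglik):
    """Pick the candidate root with the highest summed log-likelihood.

    family_loglik maps a gene family id to a dict of
    {candidate_root: log_likelihood}. Every family is assumed to
    provide a score for every candidate root (an assumption of this
    sketch, not a statement about any particular tool).
    """
    totals = defaultdict(float)
    for per_root in family_loglik.values():
        for root, loglik in per_root.items():
            # Families are treated as independent, so log-likelihoods sum.
            totals[root] += loglik
    best = max(totals, key=totals.get)
    return best, dict(totals)


# Invented toy scores for three hypothetical gene families.
toy_scores = {
    "family_1": {"root_A": -10.0, "root_B": -12.0},
    "family_2": {"root_A": -8.5, "root_B": -8.0},
    "family_3": {"root_A": -20.0, "root_B": -23.5},
}

root, totals = best_root(toy_scores)
print(root, totals)
```

In this toy input one family slightly prefers `root_B`, yet the summed evidence favors `root_A`, which is the sense in which pooling many gene families can discriminate between rooting hypotheses that single genes cannot.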
