Sequence similarity network reveals the imprints of major diversification events in the evolution of microbial life

Ancient environmental transitions, such as the shift from a reducing to an oxidizing atmosphere precipitated by the Great Oxygenation Event (GOE) ca. 2.4 billion years ago, fundamentally altered the space in which prokaryotes could derive metabolic energy. Despite these fundamental changes in Earth’s redox state, there are very few comprehensive, proteome-wide analyses of their effects on gene content and evolution. Here, we applied a pan-proteome sequence similarity network to 84 prokaryotes with broadly sampled lifestyles, categorized into four redox groups (methanogens, obligate anaerobes, facultative anaerobes, and obligate aerobes), to reconstruct the genetic inventory of major respiratory communities. We show that a set of putative core homologs that is highly conserved in prokaryotic proteomes is characterized by the loss of canonical network connections and by low conductance that correlates with differences in respiratory phenotypes. We suggest that the distinct network patterns observed for the different respiratory communities could be explained by two major evolutionary diversification events in the history of microbial life. The first event (M) is a divergence between methanogenesis and other anaerobic lifestyles in prokaryotes (archaebacteria and eubacteria). The second event (OX) is a transition from anaerobic to aerobic lifestyles, which left a proteome-wide footprint among prokaryotes. Additional analyses revealed that oxidoreductase evolution played a central role in these two diversification events. Distinct cofactor-binding domains were frequently recombined, allowing these enzymes to utilize increasingly oxidized substrates with high specificity.
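To make the notion of conductance concrete, the following is a minimal sketch rather than the procedure used in the study. It assumes the commonly used graph-theoretic definition, in which the conductance of a node group S is the number of edges leaving S divided by the smaller of the total degree inside S and the total degree outside S. The abstract does not state which definition, similarity threshold, or edge weighting was applied, so the unweighted edges, the protein labels, and the function names `build_network` and `conductance` are all hypothetical.

```python
from collections import defaultdict


def build_network(edges):
    """Build an undirected, unweighted adjacency map from (u, v) pairs.

    Each pair stands for two proteins whose sequence similarity passed
    some threshold (the threshold itself is an assumption of this sketch).
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-hits
            adj[u].add(v)
            adj[v].add(u)
    return adj


def conductance(adj, group):
    """Return cut(S) / min(vol(S), vol(rest)) for the node set `group`.

    cut(S) counts edges with exactly one endpoint in S; vol(.) is the sum
    of node degrees. Low values mean the group is weakly connected to the
    rest of the network relative to its own connectivity.
    """
    group = set(group)
    cut = sum(1 for u in group for v in adj.get(u, ()) if v not in group)
    vol_in = sum(len(adj.get(u, ())) for u in group)
    vol_out = sum(len(nbrs) for u, nbrs in adj.items() if u not in group)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


# Hypothetical toy network: two tightly knit protein families (triangles)
# joined by a single similarity edge between a3 and b1.
toy_edges = [
    ("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
    ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),
    ("a3", "b1"),
]
network = build_network(toy_edges)
family_a = {"a1", "a2", "a3"}
print(round(conductance(network, family_a), 4))
```

In this toy case a single edge leaves the group and each side has a total degree of 7, so the group's conductance is 1/7. A group whose members mostly connect to proteins outside the group would score closer to 1.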
