The Double-Stranded DNA Virosphere as a Modular Hierarchical Network of Gene Sharing

ABSTRACT Virus genomes are prone to extensive gene loss, gain, and exchange and share no universal genes. Therefore, in a broad-scale study of virus evolution, gene and genome network analyses can complement traditional phylogenetics. We performed an exhaustive comparative analysis of the genomes of double-stranded DNA (dsDNA) viruses by using the bipartite network approach and found a robust hierarchical modularity in the dsDNA virosphere. Bipartite networks consist of two classes of nodes, with nodes in one class, in this case genomes, being connected via nodes of the second class, in this case genes. Such a network can be partitioned into modules that combine nodes from both classes. The bipartite network of dsDNA viruses includes 19 modules that form 5 major and 3 minor supermodules. Of these modules, 11 include tailed bacteriophages, reflecting the diversity of this largest group of viruses. The module analysis quantitatively validates and refines previously proposed nontrivial evolutionary relationships. An expansive supermodule combines the large and giant viruses of the putative order “Megavirales” with diverse moderate-sized viruses and related mobile elements. All viruses in this supermodule share a distinct morphogenetic tool kit with a double jelly roll major capsid protein. Herpesviruses and tailed bacteriophages make up another supermodule, held together by a distinct set of morphogenetic proteins centered on the HK97-like major capsid protein. Together, these two supermodules cover the great majority of currently known dsDNA viruses. We formally identify a set of 14 viral hallmark genes that form the hubs of the network and account for most of the intermodule connections.

IMPORTANCE Viruses and related mobile genetic elements are the dominant biological entities on earth, but their evolution is not sufficiently understood and their classification is not adequately developed. The key reason is the characteristically high rate of virus evolution, which involves not only sequence change but also extensive gene loss, gain, and exchange. Therefore, in the study of virus evolution on a large scale, traditional phylogenetic approaches have limited applicability and have to be complemented by gene and genome network analyses. We applied state-of-the-art methods of such analysis to reveal robust hierarchical modularity in the genomes of double-stranded DNA viruses. Some of the identified modules combine highly diverse viruses infecting bacteria, archaea, and eukaryotes, in support of previous hypotheses on direct evolutionary relationships between viruses from the three domains of cellular life. We formally identify a set of 14 viral hallmark genes that hold together the genomic network.
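The bipartite construction described in the abstract (genomes connected only through the genes they carry, with modules mixing both node classes) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the genome and gene names are invented toy data, the partition is chosen by hand, and the score is Barber's bipartite modularity, Q = (1/m) Σ (A_ij − k_i d_j / m) summed over genome-gene pairs assigned to the same module. It is not a reconstruction of the pipeline or software used in the study.

```python
from collections import defaultdict

# Toy data, invented for illustration: genome -> set of gene families.
# Genome and gene names are hypothetical and must not collide.
GENOMES = {
    "v1": {"a", "b", "hub"},
    "v2": {"a", "b", "hub"},
    "v3": {"c", "d", "hub"},
    "v4": {"c", "d"},
}

# A hand-picked partition; each module holds both genomes and genes.
MODULE_OF = {
    "v1": 1, "v2": 1, "a": 1, "b": 1, "hub": 1,
    "v3": 2, "v4": 2, "c": 2, "d": 2,
}


def gene_degrees(genomes):
    """Number of genomes that carry each gene (the gene node's degree)."""
    degree = defaultdict(int)
    for genes in genomes.values():
        for gene in genes:
            degree[gene] += 1
    return dict(degree)


def bipartite_modularity(genomes, module_of):
    """Barber-style bipartite modularity of a given partition.

    For every genome-gene pair in the same module, add the observed
    edge (1 or 0) minus the edge probability expected from the two
    node degrees, then normalize by the number of edges m.
    """
    m = sum(len(genes) for genes in genomes.values())  # total edges
    degree = gene_degrees(genomes)
    total = 0.0
    for genome, genes in genomes.items():
        k = len(genes)  # degree of the genome node
        for gene, d in degree.items():
            if module_of[genome] == module_of[gene]:
                observed = 1.0 if gene in genes else 0.0
                total += observed - k * d / m
    return total / m


def intermodule_genes(genomes, module_of):
    """Genes present in genomes that belong to more than one module."""
    modules_seen = defaultdict(set)
    for genome, genes in genomes.items():
        for gene in genes:
            modules_seen[gene].add(module_of[genome])
    return {gene for gene, mods in modules_seen.items() if len(mods) > 1}


if __name__ == "__main__":
    print("modularity:", bipartite_modularity(GENOMES, MODULE_OF))
    print("connector genes:", intermodule_genes(GENOMES, MODULE_OF))
```

In this toy network the gene named `hub` is the only one that links the two modules, which loosely mirrors the role the abstract assigns to hallmark genes. A real analysis would search over partitions to maximize the score rather than evaluate a single hand-made one.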

[51]  D. Bamford Do viruses form lineages across different domains of life? , 2003, Research in microbiology.

[52]  Peer Bork,et al.  Orthologous Gene Clusters and Taxon Signature Genes for Viruses of Prokaryotes , 2012, Journal of bacteriology.

[53]  Santo Fortunato,et al.  Community detection in graphs , 2009, ArXiv.

[54]  J. Jehle,et al.  Nudiviruses and other large, double-stranded circular DNA viruses of invertebrates: new insights on an old topic. , 2009, Journal of invertebrate pathology.

[55]  F. Rohwer,et al.  Viruses manipulate the marine environment , 2009, Nature.

[56]  Paulien Hogeweg,et al.  The Role of Complex Formation and Deleterious Mutations for the Stability of RNA-Like Replicator Systems , 2007, Journal of Molecular Evolution.

[57]  N. Grishin,et al.  Double‐stranded DNA bacteriophage prohead protease is homologous to herpesvirus protease , 2004, Protein science : a publication of the Protein Society.

[58]  Tal Dagan,et al.  Phylogenomic networks. , 2011, Trends in microbiology.

[59]  J. Maniloff,et al.  Virus taxonomy : eighth report of the International Committee on Taxonomy of Viruses , 2005 .

[60]  E. Koonin,et al.  Evolutionary genomics of archaeal viruses: unique viral genomes in the third domain of life. , 2006, Virus research.

[61]  Joshua S Weitz,et al.  A neutral theory of genome evolution and the frequency distribution of genes , 2012, BMC Genomics.

[62]  Natalya Yutin,et al.  Virophages, polintons, and transpovirons: a complex evolutionary network of diverse selfish genetic elements with different reproduction strategies , 2013, Virology Journal.

[63]  R. Hendrix,et al.  Bacteriophages with tails: chasing their origins and evolution. , 2003, Research in microbiology.

[64]  Thomas L. Madden,et al.  Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. , 2001, Nucleic acids research.

[65]  E. Koonin Archaeal ancestors of eukaryotes: not so elusive any more , 2015, BMC Biology.

[66]  A. Abd-Alla,et al.  Phylogeny and evolution of Hytrosaviridae. , 2013, Journal of invertebrate pathology.

[67]  M. Baker,et al.  Common Ancestry of Herpesviruses and Tailed DNA Bacteriophages , 2005, Journal of Virology.

[68]  P. Forterre,et al.  The major role of viruses in cellular evolution: facts and hypotheses. , 2013, Current opinion in virology.

[69]  Lan V. Zhang,et al.  Evidence for dynamically organized modularity in the yeast protein–protein interaction network , 2004, Nature.

[70]  Forest Rohwer,et al.  Global Phage Diversity , 2003, Cell.

[71]  M. Krupovic,et al.  Virus evolution: how far does the double β-barrel viral lineage extend? , 2008, Nature Reviews Microbiology.

[72]  Johannes Söding,et al.  Automatic Prediction of Protein 3D Structures by Probabilistic Multi-template Homology Modeling , 2015, PLoS Comput. Biol..

[73]  E. Koonin,et al.  “Megavirales”, a proposed new order for eukaryotic nucleocytoplasmic large DNA viruses , 2013, Archives of Virology.

[74]  G. Hatfull Mycobacteriophages: genes and genomes. , 2010, Annual review of microbiology.

[75]  Mart Krupovic,et al.  Genomics of Bacterial and Archaeal Viruses: Dynamics within the Prokaryotic Virosphere , 2011, Microbiology and Molecular Reviews.

[76]  Vladimir Batagelj,et al.  Centrality in Social Networks , 1993 .

[77]  M. Krupovic,et al.  Gammasphaerolipovirus, a newly proposed bacteriophage genus, unifies viruses of halophilic archaea and thermophilic bacteria within the novel family Sphaerolipoviridae , 2014, Archives of Virology.

[78]  D. Stuart,et al.  What does structure tell us about virus evolution? , 2005, Current opinion in structural biology.

[79]  M. Krupovic,et al.  Double-stranded DNA viruses: 20 families and only five different architectural principles for virion assembly. , 2011, Current opinion in virology.

[80]  Paulien Hogeweg,et al.  Evolutionary dynamics of RNA-like replicator systems: A bioinformatic approach to the origin of life. , 2012, Physics of life reviews.

[81]  W. Martin,et al.  Directed networks reveal genomic barriers and DNA repair bypasses to lateral gene transfer among prokaryotes. , 2011, Genome research.

[82]  R. Hendrix Jumbo bacteriophages. , 2009, Current topics in microbiology and immunology.

[83]  Gipsi Lima-Mendez,et al.  Analysis of the phage sequence space: the benefit of structured information. , 2007, Virology.

[84]  L. Freeman Centrality in social networks conceptual clarification , 1978 .

[85]  Graham F Hatfull,et al.  Bacteriophage genomics. , 2008, Current opinion in microbiology.

[86]  Kira S. Makarova,et al.  Archaeal Clusters of Orthologous Genes (arCOGs): An Update and Application for Analysis of Shared Features between Thermococcales, Methanococcales, and Methanobacteriales , 2015, Life.

[87]  A. Barabasi,et al.  Network biology: understanding the cell's functional organization , 2004, Nature Reviews Genetics.

[88]  SödingJohannes Protein homology detection by HMM--HMM comparison , 2005 .

[89]  R. Edwards,et al.  Viral metagenomics , 2005, Nature Reviews Microbiology.

[90]  M. Krupovic,et al.  Order to the Viral Universe , 2010, Journal of Virology.

[91]  E. Szathmáry,et al.  Group selection of early replicators and the origin of life. , 1987, Journal of theoretical biology.

[92]  D. Stuart,et al.  Bacteriophage P23-77 Capsid Protein Structures Reveal the Archetype of an Ancient Branch from a Major Virus Lineage , 2013, Structure.

[93]  A. Mushegian,et al.  Evolutionarily Conserved Orthologous Families in Phages Are Relatively Rare in Their Prokaryotic Hosts , 2011, Journal of bacteriology.

[94]  Natalya Yutin,et al.  Eukaryotic large nucleo-cytoplasmic DNA viruses: Clusters of orthologous genes and reconstruction of viral genome evolution , 2009, Virology Journal.

[95]  E. Koonin,et al.  Origin and Evolution of Eukaryotic Large Nucleo-Cytoplasmic DNA Viruses , 2010, Intervirology.

[96]  D. Raoult,et al.  Reclassification of Giant Viruses Composing a Fourth Domain of Life in the New Order Megavirales , 2012, Intervirology.

[97]  E. Koonin,et al.  Conservation of major and minor jelly-roll capsid proteins in Polinton (Maverick) transposons suggests that they are bona fide viruses , 2014, Biology Direct.

[98]  A. E. Hirsh,et al.  Evolutionary Rate in the Protein Interaction Network , 2002, Science.

[99]  Nicola G A Abrescia,et al.  Insight into the Assembly of Viruses with Vertical Single β-barrel Major Capsid Proteins. , 2015, Structure.

[100]  Graham F Hatfull,et al.  Mycobacteriophage Lysin B is a novel mycolylarabinogalactan esterase , 2009, Molecular microbiology.

[101]  Ricard V Solé,et al.  When metabolism meets topology: Reconciling metabolite and reaction networks , 2010, BioEssays : news and reviews in molecular, cellular and developmental biology.

[102]  M. Rossmann,et al.  Conservation of the capsid structure in tailed dsDNA bacteriophages: the pseudoatomic structure of phi29. , 2005, Molecular cell.

[103]  Marc C. Morais,et al.  Structure of the bacteriophage φ29 DNA packaging motor , 2000, Nature.

[104]  Tal Dagan,et al.  Trends and barriers to lateral gene transfer in prokaryotes. , 2011, Current opinion in microbiology.

[105]  Paulien Hogeweg,et al.  On the Origin of DNA Genomes: Evolution of the Division of Labor between Template and Catalyst in Model Replicator Systems , 2011, PLoS Comput. Biol..