Viral dark matter and virus–host interactions resolved from publicly available microbial genomes

The ecological importance of viruses is now widely recognized, yet our limited knowledge of viral sequence space and virus–host interactions precludes accurate prediction of their roles and impacts. In this study, we mined publicly available bacterial and archaeal genomic data sets to identify 12,498 high-confidence viral genomes linked to their microbial hosts. These data augment public data sets 10-fold, provide the first viral sequences for 13 new bacterial phyla, including ecologically abundant phyla, and help taxonomically identify 7–38% of ‘unknown’ sequence space in viromes. Genome- and network-based classification was largely consistent with accepted viral taxonomy and suggested that (i) 264 new viral genera were identified (doubling known genera) and (ii) cross-taxon genomic recombination is limited. Further analyses provided empirical data on extrachromosomal prophages and coinfection prevalences, as well as an evaluation of in silico virus–host linkage predictions. Together, these findings illustrate the value of mining viral signal from microbial genomes. DOI: http://dx.doi.org/10.7554/eLife.08490.001
