Estimation of viral richness from shotgun metagenomes using a frequency count approach

Background: Viruses are important drivers of ecosystem functions, yet little is known about the vast majority of viruses. Viral shotgun metagenomics enables the investigation of broad ecological questions in phage communities. One ecological characteristic is species richness, which is the number of different species in a community. Viruses do not have a phylogenetic marker analogous to the bacterial 16S rRNA gene with which to estimate richness, and so contig spectra are employed to measure the number of virus taxa in a given community. A contig spectrum is generated from a viral shotgun metagenome by assembling the random sequence reads into groups of sequences that overlap (contigs) and counting the number of sequences that group within each contig. Current tools available to analyze contig spectra to estimate phage richness are limited by relying on rank-abundance data.

Results: We present statistical estimates of virus richness from contig spectra. The program CatchAll (http://www.northeastern.edu/catchall/) was used to analyze contig spectra in terms of frequency count data rather than rank-abundance, thus enabling formal statistical analyses. Also, the influence of potentially spurious low-frequency counts on richness estimates was minimized by two methods, empirical and statistical. The results show greater estimates of viral richness than previous calculations in nearly all environments analyzed, including swine feces and reclaimed fresh water.

Conclusions: CatchAll yielded consistent estimates of richness across viral metagenomes from the same or similar environments. Additionally, analysis of pooled viral metagenomes from different environments via mixed contig spectra resulted in greater richness estimates than those of the component metagenomes. Using CatchAll to analyze contig spectra will improve estimations of richness from viral shotgun metagenomes, particularly from large datasets, by providing statistical measures of richness.
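
To make the idea of a contig spectrum as frequency count data concrete, the following is a minimal sketch in Python. It assumes each contig is represented only by the number of reads assembled into it, and the input values are made up for illustration. The bias-corrected Chao1 estimator stands in here as a simple, well-known nonparametric richness estimator; it is not CatchAll's model-selection procedure, and treating each contig as one taxon is a simplifying assumption of this sketch.

```python
from collections import Counter


def contig_spectrum(reads_per_contig):
    """Return {q: number of contigs that contain exactly q reads}.

    Unassembled reads (singletons) are counted as contigs of size 1.
    """
    return dict(sorted(Counter(reads_per_contig).items()))


def chao1(spectrum):
    """Bias-corrected Chao1 lower-bound richness estimate.

    spectrum maps a frequency q to the count f_q of items seen q times.
    """
    observed = sum(spectrum.values())  # number of distinct contigs observed
    f1 = spectrum.get(1, 0)            # contigs with a single read
    f2 = spectrum.get(2, 0)            # contigs with exactly two reads
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical assembly result: the number of reads in each of ten contigs.
reads_per_contig = [1, 1, 1, 1, 1, 1, 2, 2, 3, 5]

spectrum = contig_spectrum(reads_per_contig)
estimate = chao1(spectrum)

print(spectrum)  # frequency count data: contig size -> number of contigs
print(estimate)  # estimated total richness, including unobserved taxa
```

The estimate depends heavily on the low-frequency counts (here the number of contigs with one or two reads), which is why spurious singletons can distort richness estimates.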
