Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample

The ongoing revolution in high-throughput sequencing continues to democratize the ability of small groups of investigators to map the microbial component of the biosphere. In particular, the coevolution of new sequencing platforms and new software tools allows data acquisition and analysis on an unprecedented scale. Here we report the next stage in this coevolutionary arms race, using the Illumina GAIIx platform to sequence a diverse array of 25 environmental samples and three known “mock communities” at a depth averaging 3.1 million reads per sample. We demonstrate excellent consistency in taxonomic recovery and recapture diversity patterns that were previously reported on the basis of metaanalysis of many studies from the literature (notably, the saline/nonsaline split in environmental samples and the split between host-associated and free-living communities). We also demonstrate that 2,000 Illumina single-end reads are sufficient to recapture the same relationships among samples that we observe with the full dataset. The results thus open up the possibility of conducting large-scale studies analyzing thousands of samples simultaneously to survey microbial communities at an unprecedented spatial and temporal resolution.

[1]  Edward Fredkin,et al.  Trie memory , 1960, Commun. ACM.

[2]  E. Myers,et al.  Basic local alignment search tool. , 1990, Journal of molecular biology.

[3]  N. Pace A molecular view of microbial diversity and the biosphere. , 1997, Science.

[4]  K. Konstantinidis,et al.  Genomic insights that advance the species definition for prokaryotes. , 2005, Proceedings of the National Academy of Sciences of the United States of America.

[5]  R. Knight,et al.  UniFrac: a New Phylogenetic Method for Comparing Microbial Communities , 2005, Applied and Environmental Microbiology.

[6]  C. Woese,et al.  The architecture of 5S rRNA and its relation to function , 1975, Journal of Molecular Evolution.

[7]  Adam Godzik,et al.  Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences , 2006, Bioinform..

[8]  Susan M. Huse,et al.  Microbial diversity in the deep sea and the underexplored “rare biosphere” , 2006, Proceedings of the National Academy of Sciences.

[9]  R. B. Jackson,et al.  The diversity and biogeography of soil bacterial communities. , 2006, Proceedings of the National Academy of Sciences of the United States of America.

[10]  Susan M. Huse,et al.  Microbial Population Structures in the Deep Marine Biosphere , 2007, Science.

[11]  F. Bushman,et al.  Short pyrosequencing reads suffice for accurate microbial community analysis , 2007, Nucleic acids research.

[12]  R. Knight,et al.  Global patterns in bacterial diversity , 2007, Proceedings of the National Academy of Sciences.

[13]  Jonathan P. Bollback,et al.  The Use of Coded PCR Primers Enables High-Throughput Sequencing of Multiple Homolog Amplification Products by 454 Parallel Sequencing , 2007, PloS one.

[14]  R. Knight,et al.  Quantitative and Qualitative β Diversity Measures Lead to Different Insights into Factors That Structure Microbial Communities , 2007, Applied and Environmental Microbiology.

[15]  G. Casella,et al.  Pyrosequencing enumerates and contrasts soil microbial diversity , 2007, The ISME Journal.

[16]  J. Tiedje,et al.  Naïve Bayesian Classifier for Rapid Assignment of rRNA Sequences into the New Bacterial Taxonomy , 2007, Applied and Environmental Microbiology.

[17]  R. Knight,et al.  Evolution of Mammals and Their Gut Microbes , 2008, Science.

[18]  Philip Hugenholtz,et al.  A renaissance for the pioneering 16S rRNA gene. , 2008, Current opinion in microbiology.

[19]  R. Knight,et al.  Error-correcting barcoded primers for pyrosequencing hundreds of samples in multiplex , 2008, Nature Methods.

[20]  R. Knight,et al.  Worlds within worlds: evolution of the vertebrate gut microbiota , 2008, Nature Reviews Microbiology.

[21]  C. Deming,et al.  Topographical and Temporal Diversity of the Human Skin Microbiome , 2009, Science.

[22]  R. Knight,et al.  Pyrosequencing-Based Assessment of Soil pH as a Predictor of Soil Bacterial Community Structure at the Continental Scale , 2009, Applied and Environmental Microbiology.

[23]  Susan M. Huse,et al.  Metagenomic study of the oral microbiota by Illumina high-throughput sequencing. , 2009, Journal of microbiological methods.

[24]  R. Knight,et al.  Microbial community profiling for human microbiome projects: Tools, techniques, and challenges. , 2009, Genome research.

[25]  Adam P. Arkin,et al.  FastTree: Computing Large Minimum Evolution Trees with Profiles instead of a Distance Matrix , 2009, Molecular biology and evolution.

[26]  R. Knight,et al.  Bacterial Community Variation in Human Body Habitats Across Space and Time , 2009, Science.

[27]  Dmitry Pushkarev,et al.  Single-molecule sequencing of an individual human genome , 2009, Nature Biotechnology.

[28]  C. Quince,et al.  Accurate determination of microbial diversity from 454 pyrosequencing data , 2009, Nature Methods.

[29]  B. Roe,et al.  A core gut microbiome in obese and lean twins , 2008, Nature.

[30]  E. Casamayor,et al.  Global ecological patterns in uncultured Archaea , 2010, The ISME Journal.

[31]  William A. Walters,et al.  QIIME allows analysis of high-throughput community sequencing data , 2010, Nature Methods.

[32]  Bernard Henrissat,et al.  Organismal, genetic, and transcriptional variation in the deeply sequenced gut microbiomes of identical twins , 2010, Proceedings of the National Academy of Sciences.

[33]  R. Knight,et al.  Fast UniFrac: Facilitating high-throughput phylogenetic analyses of microbial communities including analysis of pyrosequencing and PhyloChip data , 2009, The ISME Journal.

[34]  V. Kunin,et al.  Wrinkles in the rare biosphere: pyrosequencing errors can lead to artificial inflation of diversity estimates. , 2009, Environmental microbiology.