Microbial community composition and diversity via 16S rRNA gene amplicons: evaluating the Illumina platform

As new sequencing technologies become cheaper and older ones disappear, laboratories switch vendors and platforms. Validating each new setup is a crucial part of conducting rigorous scientific research. Here we report on the reliability and biases of bacterial 16S rRNA gene amplicon paired-end sequencing on the Illumina MiSeq platform. We designed a protocol using 50 barcode pairs to run samples in parallel and coded a pipeline to process the data. By sequencing the same sediment sample in 248 replicates, as well as 70 samples from alkaline soda lakes, we evaluated the performance of the method with regard to estimates of alpha and beta diversity. Regardless of the purification and DNA quantification procedures used, we found up to 5-fold differences in sequence yield between individually barcoded samples. One-step and two-step PCR preparations resulted in significantly different estimates of both alpha and beta diversity. Compared with a previous method based on 454 pyrosequencing, our Illumina protocol performed similarly, with the exception of evenness estimates, where correspondence between the methods was low. We further quantified the data loss at every processing step, which eventually accumulated to 50% of the raw reads. When evaluating different OTU clustering methods, we observed a stark contrast in the number of OTUs generated between QIIME with default settings and the more recent UPARSE algorithm. Still, overall trends in alpha and beta diversity corresponded closely between the two clustering methods. Our procedure performed well in terms of the precision of alpha and beta diversity estimates, with insignificant effects of individual barcodes. Comparative analyses suggest that 454 and Illumina sequence data can be combined for describing patterns in richness, beta diversity and taxonomic composition, provided that the same PCR protocol and bioinformatic workflows are used.
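
To make the diversity terms used above concrete, the following is a minimal sketch of how richness, evenness and a beta-diversity dissimilarity can be computed from per-sample OTU count vectors. The abstract does not state which indices were used, so the choice of Shannon entropy, Pielou's evenness and Bray-Curtis dissimilarity here is an assumption for illustration, and all function names are hypothetical rather than taken from the authors' pipeline.

```python
import math

def richness(counts):
    """Number of OTUs observed at least once in a sample."""
    return sum(1 for c in counts if c > 0)

def shannon(counts):
    """Shannon entropy (natural log) of the OTU relative abundances."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def pielou_evenness(counts):
    """Shannon entropy divided by its maximum, ln(richness)."""
    s = richness(counts)
    return shannon(counts) / math.log(s) if s > 1 else 0.0

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples over the same OTU order."""
    shared_total = sum(a) + sum(b)
    return sum(abs(x - y) for x, y in zip(a, b)) / shared_total

# Illustrative counts only (not data from the study): two samples, four OTUs.
sample_a = [40, 30, 20, 10]
sample_b = [70, 20, 10, 0]

print(richness(sample_a), richness(sample_b))
print(pielou_evenness(sample_a), pielou_evenness(sample_b))
print(bray_curtis(sample_a, sample_b))
```

Because Bray-Curtis on raw counts is sensitive to sequencing depth, samples whose read yield differs several-fold would normally be converted to relative abundances or subsampled to a common depth before comparison.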
