Microbial Community Composition and Diversity via 16S rRNA Gene Amplicons: Evaluating the Illumina Platform

As new sequencing technologies become cheaper and older ones disappear, laboratories switch vendors and platforms. Validating the new setups is a crucial part of conducting rigorous scientific research. Here we report on the reliability and biases of bacterial 16S rRNA gene amplicon paired-end sequencing on the Illumina MiSeq platform. We designed a protocol using 50 barcode pairs to run samples in parallel and coded a pipeline to process the data. By sequencing the same sediment sample in 248 replicates, as well as 70 samples from alkaline soda lakes, we evaluated the performance of the method with regard to estimates of alpha and beta diversity. Regardless of the purification and DNA quantification procedures used, we consistently found up to 5-fold differences in sequence yield between individually barcoded samples. One-step and two-step PCR preparations resulted in significantly different estimates of both alpha and beta diversity. Compared with a previous method based on 454 pyrosequencing, our Illumina protocol performed in a similar manner, with the exception of evenness estimates, where correspondence between the methods was low. We further quantified the data loss at every processing step, which eventually accumulated to 50% of the raw reads. When evaluating different OTU clustering methods, we observed a stark contrast in the number of OTUs generated between QIIME with default settings and the more recent UPARSE algorithm. Still, overall trends in alpha and beta diversity corresponded closely between the two clustering methods. Our procedure performed well with respect to the precision of alpha and beta diversity estimates, with insignificant effects of individual barcodes. Comparative analyses suggest that 454 and Illumina sequence data can be combined for describing patterns in richness, beta diversity and taxonomic composition, provided that the same PCR protocol and bioinformatic workflows are used.
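
To make the diversity terms used above concrete, the following is a minimal Python sketch of how alpha diversity, evenness and beta diversity can be computed from OTU count vectors. The abstract does not state which indices were used, so the choice of the Shannon index, Pielou's evenness and Bray-Curtis dissimilarity is an assumption made for illustration. The function names and the counts in `replicate_a` and `replicate_b` are hypothetical and are not data from the study.

```python
import math


def shannon(counts):
    """Shannon index H' (natural log) for one sample's OTU counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)


def pielou_evenness(counts):
    """Pielou's evenness J' = H' / ln(S), with S the number of observed OTUs."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        return 0.0  # evenness is not meaningful for fewer than two OTUs
    return shannon(counts) / math.log(richness)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples over the same OTUs."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(a) + sum(b)
    return numerator / denominator


# Hypothetical OTU counts for two technical replicates (illustration only).
replicate_a = [50, 30, 20, 0]
replicate_b = [40, 40, 10, 10]

if __name__ == "__main__":
    print("Shannon A:", shannon(replicate_a))
    print("Evenness A:", pielou_evenness(replicate_a))
    print("Bray-Curtis A vs B:", bray_curtis(replicate_a, replicate_b))
```

In a replicate design like the one described, a low spread of the alpha diversity values across replicates and small pairwise dissimilarities between replicates would indicate high precision. Because the up to 5-fold yield differences mean samples differ in sequencing depth, counts would normally be subsampled to a common depth before such a comparison; that step is omitted from this sketch.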
