Evaluating Bias of Illumina-Based Bacterial 16S rRNA Gene Profiles

ABSTRACT Massively parallel sequencing of 16S rRNA genes enables the comparison of terrestrial, aquatic, and host-associated microbial communities with sufficient sequencing depth for robust assessments of both alpha and beta diversity. Establishing standardized protocols for the analysis of microbial communities depends on increasing the reproducibility of PCR-based molecular surveys by minimizing sources of methodological bias. In this study, we tested the effects of template concentration, pooling of PCR amplicons, and sample preparation/interlane sequencing on the reproducibility of paired-end Illumina sequencing of bacterial 16S rRNA genes. Using DNA extracts from soil and fecal samples as templates, we sequenced pooled amplicons and individual reactions for both high (5- to 10-ng) and low (0.1-ng) template concentrations. In addition, all experimental manipulations were repeated on two separate days and sequenced on two different Illumina MiSeq lanes. Although within-sample sequence profiles were highly consistent, template concentration had a significant impact on sample profile variability for most samples. Pooling of multiple PCR amplicons, sample preparation, and interlane variability did not significantly influence sample sequence data. This systematic analysis underlines the importance of optimizing template concentration to minimize variability in microbial-community surveys and indicates that the practice of pooling multiple PCR amplicons prior to sequencing contributes proportionally less to reducing bias in 16S rRNA gene surveys with next-generation sequencing.
