Evaluation of the reproducibility of amplicon sequencing with Illumina MiSeq platform

Illumina’s MiSeq has become the dominant platform for gene amplicon sequencing in microbial ecology studies; however, technical concerns such as reproducibility remain. To assess reproducibility, 16S rRNA gene amplicons from 18 soil samples of a reciprocal transplantation experiment were sequenced on an Illumina MiSeq. The V4 region of the 16S rRNA gene from each sample was sequenced in triplicate, with each replicate carrying a unique barcode. At a rarefaction level of 10,323 sequences, the average OTU overlap without considering sequence abundance was 33.4±2.1% between two technical replicates and 20.2±1.7% among three. When OTU sequence abundance was considered, the average abundance-weighted OTU overlap was 85.6±1.6% and 81.2±2.1% for two and three replicates, respectively. Removing singletons significantly increased the overlap for both (~1–3%, p<0.001). Increasing the sequencing depth to 160,000 reads increased OTU overlap both when sequence abundance was considered (95%) and when it was not (44%). However, if singletons were not removed, the overlap between two technical replicates (not considering sequence abundance) plateaued at 39% with 30,000 sequences. Diversity measures were not affected by the low overlap: α-diversities were similar among technical replicates, while β-diversities (Bray-Curtis) were much smaller among technical replicates than among treatment replicates (e.g., 0.269 vs. 0.374). Higher diversity coverage, but lower OTU overlap, was observed when replicates were sequenced in separate runs. Detrended correspondence analysis indicated that, although there was considerable variation among technical replicates, reproducibility was sufficient for detecting treatment effects in the samples examined.
These results suggest that although there is variation among technical replicates, amplicon sequencing on MiSeq is useful for analyzing microbial community structure if used appropriately and with caution. For example, the following can all increase reproducibility: including technical replicates; removing spurious sequences and unrepresentative OTUs; using a high-stringency clustering method for OTU generation; estimating treatment effects at higher taxonomic levels; and adapting the unique molecular identifier (UMI) and other newly developed methods to lower PCR and sequencing error and to identify true low-abundance rare species.
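
The overlap and β-diversity measures reported above can be illustrated with a minimal Python sketch. The formulas here are assumptions, since the abstract does not spell out its definitions: unweighted overlap is taken as shared OTUs divided by all OTUs seen in either replicate, abundance-weighted overlap as the fraction of all sequences that fall in shared OTUs, and a singleton as an OTU with exactly one sequence within a replicate. The function names and the toy counts are hypothetical and are not data from the study.

```python
# Illustrative only: toy OTU tables for two technical replicates (not study data).
rep1 = {"OTU1": 50, "OTU2": 30, "OTU3": 1, "OTU4": 1}
rep2 = {"OTU1": 45, "OTU2": 35, "OTU5": 1, "OTU6": 1}


def remove_singletons(counts):
    """Drop OTUs with exactly one sequence (assumed per-replicate definition)."""
    return {otu: n for otu, n in counts.items() if n > 1}


def otu_overlap(a, b):
    """Unweighted overlap: shared OTUs / OTUs present in either replicate."""
    present_a = {otu for otu, n in a.items() if n > 0}
    present_b = {otu for otu, n in b.items() if n > 0}
    union = present_a | present_b
    return len(present_a & present_b) / len(union) if union else 0.0


def weighted_overlap(a, b):
    """Abundance-weighted overlap: share of all sequences that sit in shared OTUs."""
    shared = {otu for otu, n in a.items() if n > 0 and b.get(otu, 0) > 0}
    total = sum(a.values()) + sum(b.values())
    in_shared = sum(a[otu] + b[otu] for otu in shared)
    return in_shared / total if total else 0.0


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity on raw counts: sum|a_i - b_i| / sum(a_i + b_i)."""
    otus = set(a) | set(b)
    numerator = sum(abs(a.get(o, 0) - b.get(o, 0)) for o in otus)
    denominator = sum(a.values()) + sum(b.values())
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    print("unweighted overlap:", round(otu_overlap(rep1, rep2), 3))
    print("weighted overlap:  ", round(weighted_overlap(rep1, rep2), 3))
    print("without singletons:", round(
        otu_overlap(remove_singletons(rep1), remove_singletons(rep2)), 3))
    print("Bray-Curtis:       ", round(bray_curtis(rep1, rep2), 3))
```

In this toy case the two replicates share only the abundant OTUs, so the unweighted overlap is low while the weighted overlap is high, and removing singletons raises the unweighted overlap. That is the same qualitative pattern the abstract describes, though real comparisons would first rarefy each replicate to a common depth.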
