Systematic improvement of amplicon marker gene methods for increased accuracy in microbiome studies

Amplicon-based marker gene surveys form the basis of most microbiome and other microbial community studies. Such PCR-based methods have multiple steps, each of which is susceptible to error and bias. Variance in results has also arisen through the use of multiple methods of next-generation sequencing (NGS) amplicon library preparation. Here we formally characterized errors and biases by comparing different methods of amplicon-based NGS library preparation. Using mock community standards, we analyzed the amplification process to reveal insights into sources of experimental error and bias in amplicon-based microbial community and microbiome experiments. We present a method that improves on the current best practices and enables the detection of taxonomic groups that often go undetected with existing methods.
