Minimizing polymerase biases in metabarcoding.

DNA metabarcoding is an increasingly popular method to characterize and quantify biodiversity in environmental samples. Metabarcoding approaches simultaneously amplify a short, variable genomic region, or "barcode," from a broad taxonomic group via the polymerase chain reaction (PCR), using universal primers that anneal to flanking conserved regions. Results of these experiments are reported as occurrence data, which provide a list of taxa amplified from the sample, or relative abundance data, which measure the relative contribution of each taxon to the overall composition of amplified product. The accuracy of both occurrence and relative abundance estimates can be affected by a variety of biological and technical biases. For example, taxa with larger biomass may be better represented in environmental samples than those with smaller biomass. Here, we explore how polymerase choice, a potential source of technical bias, might influence results in metabarcoding experiments. We compared potential biases of six commercially available polymerases using a combination of mixtures of amplifiable synthetic sequences and real sedimentary DNA extracts. We find that polymerase choice can affect both occurrence and relative abundance estimates and that the main source of this bias appears to be polymerase preference for sequences with specific GC contents. We further recommend an experimental approach for metabarcoding based on results of our synthetic experiments.
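The quantities named above (occurrence data, relative abundance data, and sequence GC content) can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline: the function names, the `min_reads` threshold, and the observed-to-expected ratio used as a rough bias indicator are all assumptions introduced here for illustration.

```python
from typing import Dict, List


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def relative_abundance(read_counts: Dict[str, int]) -> Dict[str, float]:
    """Each taxon's share of the total read count."""
    total = sum(read_counts.values())
    if total == 0:
        return {taxon: 0.0 for taxon in read_counts}
    return {taxon: n / total for taxon, n in read_counts.items()}


def occurrence(read_counts: Dict[str, int], min_reads: int = 1) -> List[str]:
    """Sorted list of taxa with at least `min_reads` reads (hypothetical threshold)."""
    return sorted(t for t, n in read_counts.items() if n >= min_reads)


def abundance_ratio(observed: Dict[str, float],
                    expected: Dict[str, float]) -> Dict[str, float]:
    """Observed / expected relative abundance per taxon.

    A ratio near 1 suggests little bias for that template; this is an
    illustrative indicator only, not a metric taken from the paper.
    """
    return {t: observed.get(t, 0.0) / e for t, e in expected.items() if e > 0}
```

With a synthetic mixture of known input proportions, one could compute `gc_content` for each template and compare it against the `abundance_ratio` obtained with each polymerase to see whether over- or under-representation tracks GC content.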
