Optimizing ddRADseq in Non-Model Species: A Case Study in Eucalyptus dunnii Maiden

Restriction site-associated DNA sequencing (RADseq) and its derived protocols, such as double digest RADseq (ddRADseq), offer a flexible and highly cost-effective strategy for efficient plant genome sampling. These protocols have become some of the most popular genotyping approaches for breeding, conservation, and evolution studies in model and non-model plant species. However, universal protocols do not always adapt well to non-model species. Here we report the development of an optimized and detailed ddRADseq protocol for Eucalyptus dunnii, a non-model species, which combines different aspects of published methodologies. The initial protocol was established using only two samples by selecting the best combination of enzymes, optimizing size selection, and simplifying laboratory procedures. Both single nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs) were determined with high accuracy after applying stringent bioinformatics settings and quality filters, with and without a reference genome. To scale the protocol up to 24 samples, we added barcoded adapters and applied automated size selection, which yielded an optimal number of loci, the expected SNP locus density, and a genome-wide distribution. Reliability and cross-sequencing platform compatibility were verified through dissimilarity coefficients of 0.05 between replicates. We expect this optimized ddRADseq protocol to allow users to go from the DNA sample to genotyping data in a highly accessible and reproducible way.
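To make the replicate check concrete, the following is a minimal sketch of how a dissimilarity coefficient between two replicate genotype calls could be computed. It assumes a simple mismatch proportion over loci called in both replicates, with genotypes coded as alternate-allele counts (0, 1, 2) and missing calls as `None`; the abstract does not state which coefficient the authors used, so the function name, the coding scheme, and the toy data are all illustrative assumptions.

```python
from typing import Optional, Sequence


def replicate_dissimilarity(
    rep_a: Sequence[Optional[int]],
    rep_b: Sequence[Optional[int]],
) -> float:
    """Proportion of jointly called loci whose genotypes differ.

    Assumption (not from the source): genotypes are alternate-allele
    counts (0, 1, 2) and None marks a missing call.
    """
    if len(rep_a) != len(rep_b):
        raise ValueError("replicates must cover the same loci")

    compared = 0
    mismatched = 0
    for a, b in zip(rep_a, rep_b):
        # Skip loci that are missing in either replicate.
        if a is None or b is None:
            continue
        compared += 1
        if a != b:
            mismatched += 1

    if compared == 0:
        raise ValueError("no locus is called in both replicates")
    return mismatched / compared


# Toy data (hypothetical): 21 loci, the last one missing in replicate A.
rep_a = [0, 1, 2, 1, 0, 0, 1, 2, 2, 1, 0, 0, 1, 1, 2, 0, 1, 2, 0, 1, None]
rep_b = list(rep_a)
rep_b[3] = 2    # one discordant call among the 20 jointly called loci
rep_b[-1] = 0   # called only in replicate B, so it is not compared

d = replicate_dissimilarity(rep_a, rep_b)
print(f"dissimilarity = {d:.2f}")
```

With one discordant call out of 20 compared loci, the toy example gives a coefficient of 0.05; a value of 0 would mean the two replicates agree at every jointly called locus.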
