Genotyping by sequencing for genomic prediction in a soybean breeding population

Background: Advances in genotyping technology, such as genotyping by sequencing (GBS), are making genomic prediction more attractive as a way to reduce breeding cycle times and the costs associated with phenotyping. Genomic prediction and selection have been studied in several crop species, but no reports exist in soybean. The objectives of this study were to (i) evaluate prospects for genomic selection using GBS in a typical soybean breeding program and (ii) evaluate the effect of GBS marker selection and imputation on genomic prediction accuracy. To achieve these objectives, a set of soybean lines sampled from the University of Nebraska Soybean Breeding Program was genotyped using GBS and evaluated for yield and other agronomic traits at multiple Nebraska locations.

Results: Genotyping by sequencing scored 16,502 single nucleotide polymorphisms (SNPs) with minor-allele frequency (MAF) > 0.05 and percentage of missing values ≤ 5% on 301 elite soybean breeding lines. When SNPs with up to 80% missing values were included, 52,349 SNPs were scored. Prediction accuracy for grain yield, assessed using cross validation, was estimated to be 0.64, indicating good potential for using genomic selection for grain yield in soybean. Filtering SNPs based on missing data percentage had little to no effect on prediction accuracy, especially when random forest imputation was used to impute missing values. The highest accuracies were observed when random forest imputation was used on all SNPs, but the differences were not significant. A standard additive G-BLUP model was robust; modeling additive-by-additive epistasis did not improve prediction accuracy. The gain in accuracy from increasing training population size began to plateau at around 100 lines, but accuracy continued to climb up to the largest training population size possible in this analysis. Including only SNPs with MAF > 0.30 provided higher accuracies when training populations were smaller.

Conclusions: Using GBS for genomic prediction in soybean holds good potential to expedite genetic gain. Our results suggest that standard additive G-BLUP models can be used on unfiltered, imputed GBS data without loss in accuracy.
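The pipeline described in the abstract (filter SNPs by MAF and missing-data percentage, impute, fit an additive G-BLUP model, and assess accuracy by cross validation) can be illustrated with a minimal NumPy sketch. This is not the authors' code, and several details are assumptions that the abstract does not state: genotypes are coded 0/1/2 with NaN for missing calls, column-mean imputation stands in for random forest imputation, the relationship matrix uses a VanRaden-style scaling, the variance ratio `lam` is fixed rather than estimated (for example by REML), and accuracy is taken as the correlation between predicted and observed values. All function names are hypothetical.

```python
import numpy as np


def filter_snps(M, maf_min=0.05, max_missing=0.05):
    """Boolean mask of SNP columns passing the MAF and missingness thresholds.

    M is a (lines x SNPs) float matrix coded 0/1/2, with NaN for missing calls
    (an assumed coding; the abstract does not specify one).
    """
    missing = np.isnan(M).mean(axis=0)        # fraction of missing calls per SNP
    p = np.nanmean(M, axis=0) / 2.0           # frequency of the counted allele
    maf = np.minimum(p, 1.0 - p)              # minor-allele frequency
    return (maf > maf_min) & (missing <= max_missing)


def mean_impute(M):
    """Replace missing calls with the SNP's column mean.

    A simple stand-in for the random forest imputation named in the abstract.
    """
    col_means = np.nanmean(M, axis=0)
    return np.where(np.isnan(M), col_means, M)


def genomic_relationship(M):
    """Additive relationship matrix with VanRaden-style scaling (assumed)."""
    p = M.mean(axis=0) / 2.0
    Z = M - 2.0 * p                           # center each SNP by twice its allele frequency
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))


def gblup_predict(K, y, train, test, lam=1.0):
    """Predict test-set values from training phenotypes.

    lam is the assumed residual-to-genetic variance ratio, fixed here
    instead of being estimated from the data.
    """
    mu = y[train].mean()
    K_tt = K[np.ix_(train, train)]
    alpha = np.linalg.solve(K_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + K[np.ix_(test, train)] @ alpha


def cv_accuracy(K, y, k=5, lam=1.0, seed=0):
    """k-fold cross validation; returns the correlation of predictions with y."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    preds = np.empty(len(y))
    for i, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        preds[test] = gblup_predict(K, y, train, test, lam)
    return np.corrcoef(preds, y)[0, 1]
```

Under these assumptions, a run would chain the steps as `mask = filter_snps(M)`, `X = mean_impute(M[:, mask])`, `K = genomic_relationship(X)`, and `cv_accuracy(K, y)`. Passing `maf_min=0.30` or `max_missing=0.80` to `filter_snps` mirrors the alternative marker sets compared in the Results.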
