Genomic Prediction in Maize Breeding Populations with Genotyping-by-Sequencing

Genotyping-by-sequencing (GBS) technologies can deliver large numbers of marker genotypes with potentially less ascertainment bias than standard single nucleotide polymorphism (SNP) arrays, which has made GBS an attractive alternative technology for genomic selection. However, GBS data pose important challenges, and the accuracy of genomic prediction using GBS is currently under investigation in several crops, including maize, wheat, and cassava. The main objective of this study was to evaluate various methods for incorporating GBS information and to compare them with pedigree models for predicting the genetic values of lines from two maize populations evaluated for different traits measured in different environments (experiments 1 and 2). Because GBS data come with a large percentage of uncalled genotypes, we evaluated methods using nonimputed data, imputed data, and GBS-inferred haplotypes of different lengths (short or long). GBS and pedigree data were incorporated into statistical models using either genomic best linear unbiased predictor (GBLUP) or reproducing kernel Hilbert spaces (RKHS) regression, and prediction accuracy was quantified by cross-validation.
The main results were as follows: (i) relative to pedigree-only or marker-only models, combining pedigree and GBS data gave consistent gains in prediction accuracy; (ii) predictive ability was higher with imputed or nonimputed GBS data than with inferred haplotypes in experiment 1, and higher with nonimputed GBS data and information-based imputed short and long haplotypes than with the other methods in experiment 2; (iii) the prediction accuracy achieved using GBS data in experiment 2 is comparable to that reported by previous authors who analyzed this data set using SNP arrays; and (iv) GBLUP and RKHS models combining pedigree with nonimputed and imputed GBS data provided the best prediction correlations for the three traits in experiment 1, whereas in experiment 2 RKHS provided slightly better prediction than GBLUP in drought-stressed environments and both models provided similar predictions in well-watered environments.
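To make the comparison of GBLUP and RKHS under cross-validation concrete, the following is a minimal sketch on simulated data, not the authors' implementation. Everything in it is an assumption made for illustration: the function names (`vanraden_g`, `gaussian_kernel`, `kernel_blup_cv`), the simulated genotypes and phenotype, the naive mean imputation of uncalled genotypes, the median scaling of the Gaussian kernel, and the fixed shrinkage parameter `lam` (in practice the variance components would be estimated, for example by REML or a Bayesian sampler). Pedigree information is not included.

```python
import numpy as np

def vanraden_g(M):
    """Genomic relationship matrix from an (n_lines x n_markers) 0/1/2 matrix.
    Uncalled genotypes are NaN and are set to the marker mean (naive imputation)."""
    p = np.nanmean(M, axis=0) / 2.0            # allele frequency per marker
    Z = M - 2.0 * p                            # centre each marker
    Z = np.where(np.isnan(Z), 0.0, Z)          # missing call -> centred value 0
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def gaussian_kernel(M, h=1.0):
    """Gaussian kernel exp(-h * d^2 / median(d^2)); the median scaling is an assumption."""
    Mc = np.where(np.isnan(M), np.nanmean(M, axis=0), M)
    sq = np.sum(Mc ** 2, axis=1)
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Mc @ Mc.T, 0.0)
    np.fill_diagonal(D, 0.0)                   # exact zeros on the diagonal
    return np.exp(-h * D / np.median(D[D > 0]))

def kernel_blup_cv(K, y, lam=1.0, folds=5, seed=0):
    """Mean test-fold correlation between predicted and observed phenotypes."""
    rng = np.random.default_rng(seed)
    fold_id = rng.permutation(len(y)) % folds  # random, balanced fold labels
    cors = []
    for f in range(folds):
        test = fold_id == f
        train = ~test
        mu = y[train].mean()
        A = K[np.ix_(train, train)] + lam * np.eye(train.sum())
        alpha = np.linalg.solve(A, y[train] - mu)
        y_hat = mu + K[np.ix_(test, train)] @ alpha
        cors.append(np.corrcoef(y_hat, y[test])[0, 1])
    return float(np.mean(cors))

# Simulated example: 200 lines, 200 markers, about 20% uncalled genotypes.
rng = np.random.default_rng(1)
n, m = 200, 200
freq = rng.uniform(0.1, 0.9, size=m)
M_true = rng.binomial(2, freq, size=(n, m)).astype(float)
effects = rng.normal(size=m)
g = (M_true - M_true.mean(axis=0)) @ effects   # simulated genetic values
y = g + rng.normal(scale=0.5 * g.std(), size=n)
M = M_true.copy()
M[rng.random((n, m)) < 0.2] = np.nan           # mask calls to mimic GBS missingness

G = vanraden_g(M)
K = gaussian_kernel(M)
acc_gblup = kernel_blup_cv(G, y)
acc_rkhs = kernel_blup_cv(K, y)
print(f"GBLUP: {acc_gblup:.2f}  RKHS: {acc_rkhs:.2f}")
```

Both models share the same prediction routine and differ only in the kernel passed to it, so the cross-validated correlations can be compared directly on the same folds.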
