Weighting Strategies for Single-Step Genomic BLUP: An Iterative Approach for Accurate Calculation of GEBV and GWAS

Genomic Best Linear Unbiased Predictor (GBLUP) assumes equal variance for all single nucleotide polymorphisms (SNP). When traits are influenced by major SNP, Bayesian methods have the advantage of SNP selection. To overcome this limitation of GBLUP, unequal variances or weights are applied to the SNP in a method called weighted GBLUP (WGBLUP). If only a fraction of animals is genotyped, single-step WGBLUP (WssGBLUP) can be used. Default weights in WGBLUP or WssGBLUP are obtained iteratively based on squared single-SNP effects (u²) and/or heterozygosity. When the weights are optimal, prediction accuracy and the ability to detect major SNP are maximized. The objective was to develop optimal weights for WGBLUP-based methods. We evaluated 5 new procedures that accounted for locus-specific or window-specific variance to maximize the accuracy of predicting genomic estimated breeding values (GEBV) and SNP effects. Simulated datasets consisted of phenotypes for 13,000 animals, including 1,540 animals genotyped for 45,000 SNP. Scenarios with 5, 100, and 500 simulated quantitative trait loci (QTL) were considered. The 5 new procedures for SNP weighting were: (1) u² plus a constant equal to the weight of the top SNP; (2) weights from a heavy-tailed distribution (similar to BayesA); (3) the largest effect (u²) within each window of 20 SNP along the whole genome; (4) the mean effect of every 20 SNP; and (5) the sum of the effects of every 20 SNP. These procedures were compared with the default WssGBLUP, GBLUP, BayesB, and BayesC. WssGBLUP methods were evaluated over 10 iterations. The accuracy of predicting GEBV was measured as the correlation between true and estimated genomic breeding values for 300 genotyped animals from the last generation. The ability to detect the simulated QTL was also investigated. For most of the QTL scenarios, the accuracies obtained with all WssGBLUP procedures were higher than those from BayesB and BayesC, partly because of the automatic inclusion of parent average in the former. Manhattan plots had higher resolution with 5 and 100 QTL. Using a common weight for a window of 20 SNP that sums or averages the SNP variance enhances the accuracy of predicting GEBV and provides accurate estimation of marker effects.
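
The window-based procedures (3) to (5) and the constant-shift procedure (1) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes non-overlapping windows of consecutive SNP, a shorter final window when the SNP count is not a multiple of the window size, and that "the weight of the top SNP" means the largest u². The function names window_weights and top_snp_shift_weights are hypothetical, and any rescaling of weights between iterations is omitted because the abstract does not describe it.

```python
import numpy as np


def window_weights(u, window=20, how="sum"):
    """Give every SNP in a window one common weight derived from u**2.

    Assumption: windows are non-overlapping runs of `window` consecutive
    SNP; the last window may be shorter.
    """
    u2 = np.asarray(u, dtype=float) ** 2
    n = u2.size
    starts = np.arange(0, n, window)           # first index of each window
    counts = np.diff(np.append(starts, n))     # number of SNP per window
    if how == "max":                           # procedure (3)
        per_window = np.maximum.reduceat(u2, starts)
    elif how == "mean":                        # procedure (4)
        per_window = np.add.reduceat(u2, starts) / counts
    elif how == "sum":                         # procedure (5)
        per_window = np.add.reduceat(u2, starts)
    else:
        raise ValueError("how must be 'max', 'mean', or 'sum'")
    # Broadcast each window's value back to its member SNP.
    return np.repeat(per_window, counts)


def top_snp_shift_weights(u):
    """Procedure (1), assuming the constant is the largest u**2."""
    u2 = np.asarray(u, dtype=float) ** 2
    return u2 + u2.max()


if __name__ == "__main__":
    effects = [1.0, 2.0, 3.0, 4.0, 5.0]        # toy SNP effects, not real data
    for how in ("max", "mean", "sum"):
        print(how, window_weights(effects, window=2, how=how))
    print("shift", top_snp_shift_weights(effects))
```

In an iterative scheme such as the one described above, weights of this kind would be recomputed from the SNP effects estimated in the previous iteration; that loop is outside the scope of this sketch.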
