Robust modeling of additive and nonadditive variation with intuitive inclusion of expert knowledge

We propose a novel Bayesian approach that makes genomic modelling more robust by incorporating expert knowledge through prior distributions. The central component is a hierarchical decomposition of phenotypic variation into additive and non-additive genetic variation, which leads to an intuitive model parameterization that can be visualised as a tree. The edges of the tree represent ratios of variances, such as broad-sense heritability, which are quantities about which experts naturally hold knowledge. Penalized complexity priors are defined for all edges of the tree in a bottom-up procedure that respects the model structure and incorporates expert knowledge at every level. We investigate models with different sources of variation and compare the performance of priors that encode varying amounts of expert knowledge in the context of plant breeding. A simulation study shows that the proposed expert-knowledge priors improve the robustness of genomic modelling and the selection of the genetically best individuals in a breeding program. We observe this improvement both in variety selection on genetic values and in parent selection on additive values, with variety selection benefiting the most. In a real case study, expert knowledge increases phenotype prediction accuracy in cases where the standard maximum likelihood approach does not find optimal estimates of the variance components. Finally, we discuss the importance of expert-knowledge priors for genomic modelling and breeding, and point to future research on easy-to-use and parsimonious priors in genomic modelling.
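The tree parameterization can be illustrated with a small numerical sketch. It is not the authors' implementation. It assumes a two-level tree in which the phenotypic variance is first split into genetic and residual variance by the broad-sense heritability, and the genetic variance is then split into additive and non-additive parts. The function name `split_variance`, the parameter names, and the numbers are hypothetical and chosen only for illustration.

```python
def split_variance(total, h2_broad, additive_share):
    """Turn variance ratios on the tree edges into variance components.

    total          -- phenotypic variance at the root of the tree
    h2_broad       -- broad-sense heritability: genetic / phenotypic variance
    additive_share -- additive / genetic variance (assumed second-level edge)
    """
    for name, ratio in (("h2_broad", h2_broad), ("additive_share", additive_share)):
        if not 0.0 <= ratio <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {ratio}")

    # First split: phenotypic variance into genetic and residual variance.
    genetic = total * h2_broad
    residual = total - genetic

    # Second split: genetic variance into additive and non-additive variance.
    additive = genetic * additive_share
    nonadditive = genetic - additive

    return {
        "additive": additive,
        "nonadditive": nonadditive,
        "residual": residual,
    }


# Illustrative values only, not taken from the paper.
components = split_variance(total=1.0, h2_broad=0.5, additive_share=0.75)
```

Working with ratios rather than raw variances is what lets an expert statement such as "about half of the phenotypic variation is genetic" map directly onto a single edge of the tree. In the approach described above, a prior would be placed on each such ratio instead of fixing it to a value as done here.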
