Integrating Crop Growth Models with Whole Genome Prediction through Approximate Bayesian Computation

Genomic selection, enabled by whole genome prediction (WGP) methods, is revolutionizing plant breeding. Existing WGP methods have been shown to deliver accurate predictions in the most common settings, such as prediction of across-environment performance for traits with additive gene effects. However, prediction of traits with non-additive gene effects and prediction of genotype by environment interaction (G×E) continue to be challenging. Previous attempts to increase prediction accuracy for these particularly difficult tasks employed prediction methods that are purely statistical in nature; augmenting the statistical methods with biological knowledge has been largely overlooked thus far. Crop growth models (CGMs) attempt to represent the functional relationships between plant physiology and the environment in the formation of yield and similar output traits of interest. Thus, they can explain the impact of G×E and certain types of non-additive gene effects on the expressed phenotype. Approximate Bayesian computation (ABC), a novel and powerful computational procedure, allows the incorporation of CGMs directly into the estimation of whole genome marker effects in WGP. Here we provide a proof of concept study for this approach and demonstrate its use with synthetic data sets. We show that, for traits determined by non-additive gene effects, this approach can be considerably more accurate than the benchmark WGP method GBLUP in predicting performance both in environments represented in the estimation set and in previously unobserved environments. We conclude that using ABC to incorporate biological knowledge in the form of CGMs into WGP is a very promising approach to improving prediction accuracy for some of the most challenging scenarios in plant breeding and applied genetics.
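
The general idea of placing a CGM inside marker-effect estimation can be illustrated with a minimal ABC rejection sampler. The Python sketch below is not the procedure used in the study. It assumes a hypothetical one-trait model (`toy_cgm`), an additive marker model for the underlying physiological trait (`trait_value`), a normal prior on marker effects, and a Euclidean distance with a fixed `tolerance`. All names and numeric settings are illustrative choices.

```python
import math
import random


def toy_cgm(trait, water):
    """Hypothetical stand-in for a crop growth model (not the authors' CGM).

    Yield is a nonlinear function of a physiological trait, and the trait
    value that maximises yield shifts with water supply.
    """
    optimum = 0.5 + 0.5 * water
    return 10.0 - 20.0 * (trait - optimum) ** 2


def trait_value(genotype, effects, intercept=1.0):
    """Additive marker model for the physiological trait (an assumption)."""
    return intercept + sum(g * u for g, u in zip(genotype, effects))


def distance(simulated, observed):
    """Euclidean distance between simulated and observed yields."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(simulated, observed)))


def abc_rejection(genotypes, yields, water, n_draws, tolerance, prior_sd, rng):
    """Basic ABC rejection sampler over marker effects.

    Draw effects from a normal prior, push them through the toy CGM, and keep
    the draw only if the simulated yields fall within `tolerance` of the data.
    """
    n_markers = len(genotypes[0])
    accepted = []
    for _ in range(n_draws):
        effects = [rng.gauss(0.0, prior_sd) for _ in range(n_markers)]
        simulated = [toy_cgm(trait_value(g, effects), water) for g in genotypes]
        if distance(simulated, yields) < tolerance:
            accepted.append(effects)
    return accepted


def posterior_mean(samples):
    """Average the accepted draws marker by marker."""
    return [sum(column) / len(samples) for column in zip(*samples)]


# Synthetic example: 3 markers coded -1/+1, 30 lines, one environment.
rng = random.Random(1)
true_effects = [0.2, -0.1, 0.05]
genotypes = [[rng.choice((-1, 1)) for _ in true_effects] for _ in range(30)]
water = 0.8
yields = [toy_cgm(trait_value(g, true_effects), water) + rng.gauss(0.0, 0.1)
          for g in genotypes]

accepted = abc_rejection(genotypes, yields, water, n_draws=5000,
                         tolerance=3.0, prior_sd=0.3, rng=rng)
if accepted:
    print(len(accepted), "accepted; posterior mean:", posterior_mean(accepted))
```

Because the markers act on yield only through the nonlinear `toy_cgm`, their effects on yield are non-additive and depend on `water`, the kind of situation the abstract describes. Plain rejection sampling scales poorly as the number of markers grows, so a realistic application would likely need a more efficient ABC variant.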
