Modelling G×E with historical weather information improves genomic prediction in new environments

Abstract

Motivation: Interaction between the genotype and the environment (G×E) has a strong impact on the yield of major crop plants. Despite its influence, G×E has remained difficult to take into account explicitly in plant breeding. Recently G×E has been predicted from environmental and genomic covariates, but existing works have not shown that generalization to new environments and years is possible without access to in-season data, so practical applicability remains unclear. Using data from a barley breeding programme in Finland, we construct an in silico experiment to study the viability of G×E prediction under practical constraints.

Results: We show that the response to the environment of a new generation of untested barley cultivars can be predicted in new locations and years using genomic data, machine learning and historical weather observations for the new locations. Our results highlight the need for models of G×E: non-linear effects clearly dominate linear ones, and the interaction between the soil type and daily rain is identified as the main driver of G×E for barley in Finland. Our study implies that genomic selection can be used to capture the yield potential in G×E effects for future growth seasons, providing a possible means to achieve the yield improvements needed to feed the growing population.

Availability and implementation: The data are available in the form of kernels, accompanied by the method code (http://research.cs.aalto.fi/pml/software/gxe/bioinformatics_codes.zip), to allow the results to be reproduced.

Supplementary information: Supplementary data are available at Bioinformatics online.
