Sparse designs for genomic selection using multi-environment data

This research studies the genomic-enabled prediction accuracy of the following sparse testing allocation designs: (1) all lines non-overlapping (0 overlapping), so that no line is tested in more than one environment; (2) all lines overlapping (0 non-overlapping), so that every line is tested in all environments; and (3) combinations of the two previous cases, in which given numbers of non-overlapping (NO) and overlapping (O) lines are distributed across the environments. We also studied cases in which the size of the testing population was decreased. The study used two large maize data sets (T1 and T2). Four genomic-enabled prediction models were studied: two models (M1 and M2) that do not include genomic × environment interaction (GE), and two models (M3 and M4) that incorporate two different forms of modeling GE. The results show that, for both data sets, the genome-based models including GE (M3 and M4) captured more genetic variability through the GE component than the other models. Models M3 and M4 also provided higher prediction accuracy than models M1 and M2 across the allocation designs comprising different combinations of NO/O lines in environments. The results indicate that substantial savings of testing resources can be achieved by optimizing the allocation design using genome-based models that include genomic × environment interaction.
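To make the three allocation designs concrete, the following is a minimal Python sketch, not the procedure used in the study. The function name `allocate_sparse`, its parameters, and the random, even split of non-overlapping lines are assumptions introduced only for illustration. Setting `n_overlap` to 0 gives design (1), setting it to the total number of lines gives design (2), and any value in between gives a design of type (3).

```python
import random


def allocate_sparse(lines, n_envs, n_overlap, seed=0):
    """Hypothetical helper: assign lines to environments.

    `n_overlap` lines (O) are tested in every environment. The remaining
    lines (NO) are dealt out round-robin so that each one is tested in
    exactly one environment.
    """
    if n_envs < 1:
        raise ValueError("n_envs must be at least 1")
    if not 0 <= n_overlap <= len(lines):
        raise ValueError("n_overlap must be between 0 and len(lines)")

    rng = random.Random(seed)      # seeded so the design is reproducible
    shuffled = list(lines)
    rng.shuffle(shuffled)

    overlapping = shuffled[:n_overlap]
    non_overlapping = shuffled[n_overlap:]

    # Every environment starts with the full set of overlapping lines.
    design = {env: list(overlapping) for env in range(n_envs)}

    # Each non-overlapping line goes to exactly one environment.
    for i, line in enumerate(non_overlapping):
        design[i % n_envs].append(line)
    return design


if __name__ == "__main__":
    lines = [f"G{i}" for i in range(12)]   # made-up line identifiers
    design = allocate_sparse(lines, n_envs=4, n_overlap=4)
    for env, tested in design.items():
        print(env, sorted(tested))
```

With 12 lines, 4 environments, and 4 overlapping lines, each environment receives 6 lines, for 24 field plots in total, compared with 48 plots if every line were tested in every environment. This is the kind of resource saving that a sparse design trades against prediction accuracy.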