Genomic Selection in Plant Breeding

Abstract

“Genomic selection,” the ability to select for even complex, quantitative traits based on marker data alone, has arisen from the conjunction of new high-throughput marker technologies and new statistical methods needed to analyze the data. This review surveys what is known about these technologies, with sections on population and quantitative genetic background, DNA marker development, statistical methods, reported accuracies of genomic selection (GS) predictions, prediction of nonadditive genetic effects, prediction in the presence of subpopulation structure, and impacts of GS on long-term gain. GS works by estimating the effects of many loci spread across the genome. Marker and observation numbers therefore need to scale with the genetic map length in Morgans and with the effective population size of the population under GS. For typical crops, the requirements range from at least 200 to at most 10,000 markers and observations. With that baseline, GS can greatly accelerate the breeding cycle while also using marker information to maintain genetic diversity and potentially prolong gain beyond what is possible with phenotypic selection. With the costs of marker technologies continuing to decline and the statistical methods becoming more routine, the results reviewed here suggest that GS will play a large role in the plant breeding of the future. Our summary and interpretation should prove useful to breeders as they assess the value of GS in the context of their populations and resources.
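To make "estimating the effects of many loci spread across the genome" concrete, the following is a minimal sketch of one common approach, ridge regression on all markers at once, applied to simulated data. It is an illustration under stated assumptions rather than the method of any particular study: the function names, the -1/0/1 marker coding, the independent (unlinked) markers, and the choice of shrinkage parameter as the ratio of residual variance to marker-effect variance are all assumptions made for this example.

```python
import numpy as np


def ridge_marker_effects(X, y, lam):
    """Jointly estimate all marker effects by ridge regression.

    X   : (n_lines, n_markers) genotype matrix (assumed coding: -1/0/1)
    y   : (n_lines,) phenotypes
    lam : shrinkage parameter (assumed known here; in practice it is estimated)
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean  # center markers so the intercept is the phenotypic mean
    A = Xc.T @ Xc + lam * np.eye(X.shape[1])
    effects = np.linalg.solve(A, Xc.T @ (y - y_mean))
    return effects, x_mean, y_mean


def predict_gebv(X_new, effects, x_mean, y_mean):
    """Genomic estimated breeding values from marker data alone."""
    return y_mean + (X_new - x_mean) @ effects


def demo(n_train=400, n_test=100, n_markers=200, h2=0.5, seed=1):
    """Simulate a toy population, train on phenotyped lines, predict the rest."""
    rng = np.random.default_rng(seed)
    n = n_train + n_test
    # Simplifying assumption: independent markers, i.e. no linkage disequilibrium.
    X = rng.integers(0, 3, size=(n, n_markers)).astype(float) - 1.0
    true_effects = rng.normal(0.0, 1.0, n_markers)  # every marker has an effect
    g = X @ true_effects  # true genetic values
    noise_sd = np.sqrt(g.var() * (1.0 - h2) / h2)  # sets heritability near h2
    y = g + rng.normal(0.0, noise_sd, n)

    lam = noise_sd**2 / 1.0  # residual variance / marker-effect variance
    effects, x_mean, y_mean = ridge_marker_effects(X[:n_train], y[:n_train], lam)

    # Predict unphenotyped lines and compare with their true genetic values.
    gebv = predict_gebv(X[n_train:], effects, x_mean, y_mean)
    accuracy = np.corrcoef(gebv, g[n_train:])[0, 1]
    return effects, accuracy


if __name__ == "__main__":
    _, acc = demo()
    print(f"prediction accuracy (correlation with true genetic value): {acc:.2f}")
```

Rerunning `demo` with fewer training lines relative to the number of markers (for example `n_train=100`) gives a rough feel for why observation numbers have to grow with the number of effects being estimated. Real breeding populations have linked markers and unknown variance components, so the numbers from this toy simulation should not be read as expected accuracies.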
