Effect of population size and unbalanced data sets on QTL detection using genome-wide association mapping in barley breeding germplasm

Over the past two decades many quantitative trait loci (QTL) have been detected; however, very few have been incorporated into breeding programs. The recent development of genome-wide association studies (GWAS) in plants provides the opportunity to detect QTL in germplasm collections such as unstructured populations from breeding programs. The overall goal of the barley Coordinated Agricultural Project was to conduct GWAS with the intent of coupling QTL detection and breeding. The underlying idea is that breeding programs generate vast amounts of phenotypic data which, combined with inexpensive genotyping, should make it possible to use GWAS to detect QTL that are immediately accessible to, and usable by, breeding programs. Constraints on using breeding program-derived phenotype data for GWAS include limited population size and unbalanced data sets. We chose the highly heritable trait heading date to study these two factors. We examined 766 spring barley breeding lines (panel #1) grown in balanced trials and a subset of 384 spring barley breeding lines (panel #2) grown in balanced and unbalanced trials. In panel #1, we detected three major QTL for heading date that had been detected in previous bi-parental mapping studies. Simulation studies showed that population sizes greater than 384 individuals are required to detect QTL consistently. We also showed that unbalanced data sets from panel #2 can be used to detect the three major QTL. However, unbalanced data sets resulted in an increase in the false-positive rate. Interestingly, one-step analysis performed better than two-step analysis in reducing the false-positive rate. The results of this work show that it is possible to use phenotypic data from breeding programs to detect QTL, but that careful consideration of population size and experimental design is required.
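To illustrate why detection depends so strongly on population size, the following is a minimal simulation sketch. It is not the simulation procedure used in the study. It assumes a single biallelic QTL in inbred lines with allele frequency 0.5, a simple single-marker regression with no correction for population structure or kinship, a normal approximation to the test statistic, and a Bonferroni threshold. The function name `detection_power` and the parameter values (`qtl_r2=0.05`, `n_markers=3000`) are hypothetical choices made only for illustration.

```python
import math
import numpy as np


def detection_power(n_lines, qtl_r2=0.05, n_markers=3000, alpha=0.05,
                    n_reps=500, seed=0):
    """Fraction of simulated panels in which one QTL passes a Bonferroni threshold.

    All parameter defaults are illustrative assumptions, not values from the study.
    """
    rng = np.random.default_rng(seed)
    threshold = alpha / n_markers  # Bonferroni-corrected significance level
    hits = 0
    for _ in range(n_reps):
        # Inbred lines: marker genotype coded 0/1, allele frequency 0.5.
        g = rng.integers(0, 2, size=n_lines).astype(float)
        if g.std() == 0:
            continue  # monomorphic sample, nothing to test
        # Scale the QTL effect so it explains about qtl_r2 of phenotypic variance.
        g_std = (g - g.mean()) / g.std()
        noise = rng.standard_normal(n_lines)
        y = math.sqrt(qtl_r2) * g_std + math.sqrt(1.0 - qtl_r2) * noise
        # Single-marker regression: t statistic from the marker-trait correlation.
        r = np.corrcoef(g, y)[0, 1]
        t = r * math.sqrt((n_lines - 2) / (1.0 - r * r))
        # Two-sided p-value using a normal approximation to the t distribution.
        p = math.erfc(abs(t) / math.sqrt(2.0))
        hits += p < threshold
    return hits / n_reps


if __name__ == "__main__":
    for n in (96, 192, 384, 768):
        print(n, detection_power(n))
```

Under these assumptions, estimated power rises with the number of lines, which is the qualitative pattern the abstract describes. The specific population size at which detection becomes consistent depends on the assumed effect size, marker count, and threshold, so the sketch should not be read as reproducing the reported cutoff of 384 individuals.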
