The Power of Inbreeding: NGS-Based GWAS of Rice Reveals Convergent Evolution during Rice Domestication

ABSTRACT

Low-coverage whole-genome sequencing is an effective strategy for genome-wide association studies in humans, due to the availability of large reference panels for genotype imputation. However, it is unclear whether this strategy can be utilized in other species without reference panels. Using simulations, we show that this approach is even more relevant in inbred species such as rice (Oryza sativa L.), which are effectively haploid, allowing easy haplotype construction and imputation-based genotype calling, even without the availability of large reference panels. We sequenced 203 rice varieties with well-characterized phenotypes from the United States Department of Agriculture Rice Mini-Core Collection at an average depth of 1.5× and used the data for mapping three traits. For the first two traits, amylose content and seed length, our approach leads to direct identification of the previously identified causal SNPs in the major-effect loci. For the third trait, pericarp color, an important trait that underwent selection during domestication, we identified a new major-effect locus. Although known loci can explain color variation in the varieties of two main subspecies of Asian domesticated rice, japonica and indica, the new locus identified is unique to another domesticated rice subgroup, aus, and together with existing loci, can fully explain the major variation in pericarp color in aus. Our discovery of a unique genetic basis of white pericarp in aus provides an example of convergent evolution during rice domestication and suggests that aus may have a domestication history independent of japonica and indica.

Key words: inbreeding, GWAS, rice, pericarp color

Wang H., Xu X., Vieira F.G., Xiao Y., Li Z., Wang J., Nielsen R., and Chu C. (2016). The Power of Inbreeding: NGS-Based GWAS of Rice Reveals Convergent Evolution during Rice Domestication. Mol. Plant. 9, 975–985.
