Candidate Genes and Genetic Architecture of Symbiotic and Agronomic Traits Revealed by Whole-Genome, Sequence-Based Association Genetics in Medicago truncatula

Genome-wide association studies (GWAS) have revolutionized the search for the genetic basis of complex traits. To date, GWAS have generally relied on relatively sparse sampling of nucleotide diversity, which is likely to bias results by preferentially sampling high-frequency SNPs that are not in complete linkage disequilibrium (LD) with causative SNPs. To avoid these limitations, we conducted GWAS with >6 million SNPs identified by sequencing the genomes of 226 accessions of the model legume Medicago truncatula. We used these data to identify candidate genes and the genetic architecture underlying phenotypic variation in plant height, trichome density, flowering time, and nodulation. The characteristics of candidate SNPs differed among traits: candidates for flowering time and trichome density fell in distinct clusters of high LD, and the minor allele frequencies (MAF) of candidates underlying variation in flowering time and height were significantly greater than the MAF of candidates underlying variation in other traits. Candidate SNPs tagged several characterized genes, including the nodulation-related genes SERK2, MtnodGRP3, MtMMPL1, NFP, CaML3, and MtnodGRP3A and the flowering time gene MtFD, as well as uncharacterized genes that become candidates for further molecular characterization. By comparing sequence-based candidates to candidates identified by in silico 250K SNP arrays, we provide an empirical example of how reliance on even high-density, reduced-representation genomic markers can bias GWAS results. Depending on the trait, only 30–70% of the top 20 in silico array candidates were within 1 kb of sequence-based candidates. Moreover, the sequence-based candidates tagged by array candidates were heavily biased towards common variants. These comparisons underscore the need for caution when interpreting results from GWAS conducted with sparsely covered genomes.
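The array-versus-sequence comparison above can be illustrated with a minimal sketch. It assumes candidates are given as (chromosome, position) pairs and counts an array candidate as matched when any sequence-based candidate on the same chromosome lies within 1 kb. The function name `fraction_within`, the data layout, and the toy coordinates are hypothetical and are not taken from the study.

```python
from bisect import bisect_left


def fraction_within(array_hits, seq_hits, window=1000):
    """Fraction of array candidates within `window` bp of any
    sequence-based candidate on the same chromosome.

    Both inputs are lists of (chromosome, position) tuples;
    this layout is an assumption made for illustration.
    """
    # Group sequence-based candidate positions by chromosome and sort them
    by_chrom = {}
    for chrom, pos in seq_hits:
        by_chrom.setdefault(chrom, []).append(pos)
    for positions in by_chrom.values():
        positions.sort()

    matched = 0
    for chrom, pos in array_hits:
        positions = by_chrom.get(chrom, [])
        # Only the nearest neighbour on each side can be the closest hit
        i = bisect_left(positions, pos)
        neighbours = positions[max(i - 1, 0):i + 1]
        if any(abs(pos - p) <= window for p in neighbours):
            matched += 1
    return matched / len(array_hits) if array_hits else 0.0


# Toy coordinates, invented purely to exercise the function
seq_candidates = [("chr1", 5000), ("chr2", 100)]
array_candidates = [("chr1", 5900), ("chr1", 7000), ("chr2", 1100), ("chr3", 10)]
overlap = fraction_within(array_candidates, seq_candidates)
print(f"{overlap:.0%} of array candidates lie within 1 kb")
```

Applied per trait to the top 20 array candidates, a calculation of this kind would yield the sort of overlap percentage reported above, although the authors' exact procedure may differ.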
