REMARKABLE SELECTIVE CONSTRAINTS ON EXONIC DINUCLEOTIDE REPEATS

Long dinucleotide repeats in exons present a substantial mutational hazard: mutations at these loci occur often and generate frameshifts. Here, we provide evidence that exonic dinucleotide repeats experience strong selective constraint. In humans, only 18 exonic dinucleotide repeats have repeat lengths greater than six, which contrasts sharply with the genome‐wide distribution of dinucleotide repeat lengths. We genotyped each of these 18 repeats in 200 humans from eight 1000 Genomes Project populations and found a near‐absence of polymorphism. More remarkably, divergence data demonstrate that repeat lengths have been conserved across the primate phylogeny despite what is likely considerable mutational pressure. Coalescent simulations show that even a very low mutation rate at these loci fails to explain the anomalous patterns of polymorphism and divergence. Our data support two related selective constraints on the evolution of exonic dinucleotide repeats: a short‐term intolerance of any change to repeat length and a long‐term prevention of increases in repeat length. In general, our results implicate purifying selection as the force that eliminates new, deleterious mutants at exonic dinucleotide repeats. We briefly discuss the evolution of the longest exonic dinucleotide repeat in the human genome, a 10 × CA repeat in fibroblast growth factor receptor‐like 1 (FGFRL1), which should have a considerably greater mutation rate than any other exonic dinucleotide repeat and should therefore generate a large number of deleterious variants.
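
To make the frameshift hazard concrete, the following is a minimal Python sketch, not the pipeline used in this study. It scans a coding sequence for perfect dinucleotide repeats and checks whether a given change in repeat number shifts the reading frame. The function names, the `MIN_UNITS` threshold (which reads "repeat lengths greater than six" as at least seven repeat units), and the example sequence are all assumptions introduced for illustration.

```python
import re

# Assumption: "repeat lengths greater than six" is read as more than six
# repeat units, i.e. a minimum of seven units.
MIN_UNITS = 7


def find_dinucleotide_repeats(seq, min_units=MIN_UNITS):
    """Return (start, unit, n_units) for each perfect dinucleotide run."""
    seq = seq.upper()
    # One 2-bp unit followed by at least (min_units - 1) further copies of it.
    pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_units - 1))
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        if unit[0] == unit[1]:
            continue  # skip mononucleotide runs such as AAAA...
        hits.append((match.start(), unit, len(match.group(0)) // 2))
    return hits


def shifts_frame(unit_change):
    """A gain or loss of k dinucleotide units changes length by 2*k bases."""
    return (2 * unit_change) % 3 != 0


# Hypothetical exon fragment containing a 10 x CA repeat.
example = "ATG" + "CA" * 10 + "GGT"
print(find_dinucleotide_repeats(example))  # [(3, 'CA', 10)]
print(shifts_frame(1), shifts_frame(3))    # True False
```

In this sketch, only changes of a multiple of three repeat units (six bases) preserve the reading frame, so the single-unit gains and losses expected to dominate at such loci would almost always produce frameshifts.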
