Genome nucleotide composition shapes variation in simple sequence repeats.

Simple sequence repeats (SSRs), or microsatellites, are a common component of genomes, but their abundance varies greatly across species. We tested the hypothesis that this variation is due in part to the AT/GC content of genomes: genomes biased toward either high AT or high GC should generate, by chance, more short repeats that are long enough to expand through slippage during replication. To test this hypothesis, we identified perfect tandem repeats with unit lengths of 1-6 bp in 25 protists with complete or near-complete genome sequences. As expected, SSR density and frequency are strongly related to genome AT content, with excellent fits to quadratic regressions that have minima near 50% AT content and rise toward both extremes. The same trends hold within species, except that the limited variation in AT content within each species places it mainly on the descending (GC-rich), middle, or ascending (AT-rich) part of the curve. The base composition of repeat motifs is also significantly correlated with genome nucleotide composition: the percentage of AT-rich motifs rises as genome AT content increases, whereas the percentage of GC-rich motifs falls. Amino acid homopolymer repeats also show the expected quadratic relationship, with higher abundance in species whose AT content is biased in either direction. Our results show that genome nucleotide composition explains up to half of the variance in the abundance and motif composition of SSRs.
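As an illustration of the kind of scan described above, the following minimal Python sketch finds perfect tandem repeats with 1-6 bp units and computes AT content. It is not the authors' pipeline: the function names (`at_content`, `find_perfect_ssrs`) are hypothetical, and the 12 bp minimum tract length is an assumed threshold, since the abstract does not state one.

```python
import math
import re


def at_content(seq):
    """Fraction of A/T among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    total = sum(seq.count(b) for b in "ACGT")
    return (seq.count("A") + seq.count("T")) / total if total else 0.0


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_perfect_ssrs(seq, min_bp=12, max_unit=6):
    """Return sorted (start, motif, copies) for perfect tandem repeats.

    min_bp is an ASSUMED minimum tract length, not a value from the paper.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_unit + 1):
        # Require enough copies to reach min_bp, and at least two copies.
        copies = max(2, math.ceil(min_bp / k))
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip e.g. "ATAT", which is already counted as "AT".
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GC" + "AT" * 6 + "GC" + "A" * 12 + "GC"
    print(at_content(demo))
    print(find_perfect_ssrs(demo))
```

Dividing the number of hits by genome length gives a frequency, and dividing the total repeat length by genome length gives a density; either could then be regressed on AT content with a quadratic term, which is the relationship the abstract reports.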
