Preference of simple sequence repeats in coding and non-coding regions of Arabidopsis thaliana

MOTIVATION: Simple sequence repeats (SSRs), or microsatellites, are abundant in many genomes. However, the significance of their preferential distribution across genomic regions is not fully understood. The completed Arabidopsis genome sequence allows microsatellites to be characterized more thoroughly.

RESULTS: Microsatellites were more abundant in the 5'-flanking regions of genes than expected from the whole genome, with an over-representation of AG and AAG repeats. This distribution differed clearly from those in 3'-flanks and coding fractions, where triplet frequencies evidently corresponded to codon usage. We identified 1140 full-length genes that contained at least one locus of AG or AAG repeats in their upstream sequences, and whose functional characteristics were significantly associated with the repeats. This observation indicates that selective pressure differed markedly among the three transcribed regions, with positive selection of AG and AAG repeats in 5'-flanks close to genes whose products are preferentially involved in transcription.
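To make the notion of "a locus of AG or AAG repeats in an upstream sequence" concrete, the following is a minimal Python sketch of how perfect tandem repeats of a given motif might be located in a DNA string. It is an illustration only, not the authors' method: the function name `find_ssrs`, the `min_repeats` threshold of 4, the restriction to the forward strand, and the toy `upstream` sequence are all assumptions, since the abstract does not state the detection criteria used.

```python
import re


def find_ssrs(seq, motifs=("AG", "AAG"), min_repeats=4):
    """Locate perfect tandem repeats of each motif in seq.

    Returns a list of (motif, start, end, repeat_count) tuples sorted by
    start position, with 0-based, end-exclusive coordinates.
    Assumption: min_repeats=4 is an arbitrary illustrative threshold, and
    only the given strand is scanned (no reverse complement).
    """
    seq = seq.upper()
    hits = []
    for motif in motifs:
        # Non-capturing group repeated at least min_repeats times in a row.
        pattern = re.compile(r"(?:%s){%d,}" % (re.escape(motif), min_repeats))
        for match in pattern.finditer(seq):
            count = (match.end() - match.start()) // len(motif)
            hits.append((motif, match.start(), match.end(), count))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical upstream sequence containing one AG run and one AAG run.
upstream = "TTC" + "AG" * 6 + "CT" + "AAG" * 5 + "TTA"
for motif, start, end, count in find_ssrs(upstream):
    print(motif, start, end, count)
```

A real survey would also need to handle imperfect repeats, the reverse-complement motifs (CT and CTT), and a length threshold chosen per motif size; those choices are left out of this sketch.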
