Skew of Mononucleotide Frequencies, Relative Abundance of Dinucleotides, and DNA Strand Asymmetry

Abstract. Using 152 completely sequenced mitochondrial genomes and 36 completely sequenced bacterial chromosomes, together with three long contigs from human chromosomes 6, 21, and 22, we examined skews of mononucleotide frequencies and the relative abundance of dinucleotides in one DNA strand. Each group of genomes has its own characteristics. In mitochondrial genomes, both CpG and GpT are underrepresented, while GpG, CpC, or both are overrepresented. The relative frequencies of T vs A and of G vs C are strongly skewed, presumably because of strand asymmetry in replication errors and unidirectional DNA replication from single origins. Exceptions are found in the plant and yeast mitochondrial genomes, each of which may replicate from multiple origins. In bacterial genomes, the "universal" rule of CpG deficiency is restricted to archaebacteria and some eubacteria. In other eubacteria, the most underrepresented dinucleotide is either TpA or GpT. In general, there are significant T vs A and G vs C skews in each half of the bacterial genome, although these cancel out almost exactly over the whole genome. In human chromosomes 6, 21, and 22, the dinucleotide CpG tends to be avoided. The relative frequencies of mononucleotides exhibit conspicuous local skews, suggesting that each of these chromosomal segments contains more than one DNA replication origin. We conclude that, when a genomic region contains several replicons, not only the number of DNA replication origins but also their directionality is important, and that the observed patterns of nucleotide frequencies in the genome strongly support the hypothesis of strand asymmetry in replication errors.
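
The two quantities discussed in the abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so the definitions below are assumptions based on common conventions: the skew of base X vs base Y is taken as (X - Y)/(X + Y), and the relative abundance of dinucleotide XpY is taken as the odds ratio f(XY) / (f(X) f(Y)), computed on a single strand. The function names `skew` and `dinucleotide_relative_abundance` are hypothetical and chosen only for this illustration.

```python
from collections import Counter


def skew(seq, x, y):
    """Return (n_x - n_y) / (n_x + n_y) for bases x and y on one strand.

    Assumed convention: positive values mean x is more frequent than y.
    Returns 0.0 when neither base occurs.
    """
    seq = seq.upper()
    nx, ny = seq.count(x), seq.count(y)
    total = nx + ny
    return (nx - ny) / total if total else 0.0


def dinucleotide_relative_abundance(seq):
    """Return {dinucleotide: observed frequency / expected frequency}.

    Assumed definition: the expected frequency of XpY is f(X) * f(Y),
    using single-strand mononucleotide frequencies. Values below 1
    suggest underrepresentation, values above 1 overrepresentation.
    Dinucleotides whose expected frequency is zero are omitted.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        return {}
    mono = Counter(seq)
    # Count overlapping dinucleotides along the strand.
    di = Counter(seq[i:i + 2] for i in range(n - 1))
    rho = {}
    for a in "ACGT":
        for b in "ACGT":
            expected = (mono[a] / n) * (mono[b] / n)
            if expected > 0:
                rho[a + b] = (di[a + b] / (n - 1)) / expected
    return rho


if __name__ == "__main__":
    example = "ATGGCGTTTACCGGTTTAGC"  # toy sequence, not real data
    print("T vs A skew:", skew(example, "T", "A"))
    print("G vs C skew:", skew(example, "G", "C"))
    print("CpG relative abundance:",
          dinucleotide_relative_abundance(example).get("CG"))
```

Applying `skew` separately to each half of a circular bacterial chromosome, rather than to the whole sequence, is one way to see the pattern the abstract describes, in which the two halves show opposite skews that nearly cancel genome-wide.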
