Wide variations in neighbor-dependent substitution rates.

The pattern of 20,200 point substitutions in the 16 unique neighbor-pair environments has been determined from aligned gene/pseudogene sequences in the current database of human DNA sequences. Substitution rates, averaged over different regions of the genome, span a 60-fold range, with strong biases in particular neighbor-pair environments. Substitutions involving the CG doublet are the most rapid overall; rates of change of the C.G pair vary over a tenfold range depending on the type of substitution and the 5' neighbor pair. In general, rates are fastest in alternating purine-pyrimidine sequences and slowest in purine.pyrimidine tracts, suggesting that the frequencies of one or both of the key molecular misadventures that can occur during replication, dNTP misinsertion and transient misalignment, may be associated with structural alternations and flexibility of the backbone. By contrast, purine.pyrimidine tracts are less flexible and less prone to substitution, and they therefore accumulate in sequences over time. Characteristic biases in the content and arrangement of oligonucleotide strings, or tuples, in all sequence elements, but particularly in non-coding regions, appear to be due to this pattern of neighbor-dependent substitution rates. Computer simulations of numerous replicative cycles have been carried out with substitutions occurring on the same schedule found in this study for pseudogenes. Statistical analyses of tuple frequencies at periodic intervals during the simulation indicate that sequences slowly change in lexical complexity toward a quasi-equilibrium state that corresponds to that of introns.
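The simulation idea described above can be illustrated with a minimal sketch. It is not the authors' program: the function names, the rate table keyed by doublet (5' neighbor plus the base being replaced), and every numeric rate are assumptions chosen for illustration, and only the 3' base of each doublet is mutated to keep the example short. It applies neighbor-dependent substitutions over repeated cycles and counts overlapping k-tuples so their frequencies can be compared at intervals.

```python
import random
from collections import Counter

BASES = "ACGT"

def tuple_counts(seq, k):
    """Count overlapping k-tuples (oligonucleotide strings of length k)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def simulate(seq, rates, cycles, default=0.0, rng=None):
    """Apply neighbor-dependent substitutions for a number of cycles.

    rates maps a doublet such as "CG" to a per-cycle substitution
    probability for its 3' base. These values are hypothetical
    placeholders, not the rates measured in the study.
    """
    rng = rng or random.Random(0)
    seq = list(seq)
    for _ in range(cycles):
        old = seq[:]  # neighbors are read from the previous cycle's sequence
        for i in range(1, len(old)):
            p = rates.get(old[i - 1] + old[i], default)
            if rng.random() < p:
                # Replace the base with one of the three alternatives.
                seq[i] = rng.choice([b for b in BASES if b != old[i]])
    return "".join(seq)

if __name__ == "__main__":
    rng = random.Random(1)
    start = "".join(rng.choice(BASES) for _ in range(2000))
    # Hypothetical schedule: the CG doublet substitutes faster than the rest.
    rates = {"CG": 0.02}
    end = simulate(start, rates, cycles=200, default=0.002, rng=rng)
    print("CG count before:", tuple_counts(start, 2)["CG"])
    print("CG count after: ", tuple_counts(end, 2)["CG"])
```

Reading neighbors from a snapshot of the previous cycle keeps each cycle's substitutions independent of the order in which positions are visited. A fuller model would also need rates for the 5' base of each doublet and for each type of substitution.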
