Computational analysis of mutation spectra

Mutation frequencies vary along a nucleotide sequence, and nucleotide positions with an exceptionally high mutation frequency are called hotspots. Mutation hotspots in DNA often reflect intrinsic properties of the mutation process, such as the specificity with which mutagens interact with nucleic acids and the sequence specificity of DNA repair and replication enzymes. They may also reflect structural and functional features of the target protein or RNA sequences in which they occur. The determinants of mutation frequency and specificity are complex, and many analytical methods exist for studying them. This paper discusses computational approaches to analysing mutation spectra (the distribution of mutations along target genes) that include many detectable (mutable) positions. The following methods are reviewed: mutation hotspot prediction; pairwise and multiple comparisons of mutation spectra; derivation of a consensus sequence; and analysis of correlation between nucleotide sequence features and mutation spectra. Spectra of spontaneous and induced mutations are used to illustrate the complexities and pitfalls of such analyses. In general, the DNA sequence context of mutation hotspots is a fingerprint of interactions between DNA and DNA repair, replication, and modification enzymes, and analysis of hotspot context provides evidence of such interactions.
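As a rough illustration of what hotspot prediction can involve, the sketch below flags sites whose mutation count is improbably high under a deliberately simple null model. It is not the method reviewed in the paper. It assumes that every detectable site is equally mutable, so that per-site counts follow a single Poisson distribution, and it applies a Bonferroni correction across sites. The function names `poisson_tail` and `find_hotspots`, the `alpha` parameter, and the example counts are all hypothetical.

```python
import math


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    # Sum the probability mass below k and take the complement.
    cdf = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def find_hotspots(counts, alpha=0.05):
    """Return indices of sites with an unexpectedly high mutation count.

    Assumption (not from the source): all sites share one Poisson rate,
    estimated as the mean count per site. A site is flagged when its
    upper-tail probability falls below alpha / number_of_sites.
    """
    n_sites = len(counts)
    if n_sites == 0:
        return []
    lam = sum(counts) / n_sites
    if lam == 0:
        return []
    cutoff = alpha / n_sites  # Bonferroni correction over all sites
    return [i for i, c in enumerate(counts)
            if c > 0 and poisson_tail(c, lam) < cutoff]


# Hypothetical spectrum: mutation counts at ten detectable positions.
spectrum = [1, 0, 2, 1, 0, 15, 1, 0, 1, 1]
hotspots = find_hotspots(spectrum)
```

Real spectra rarely satisfy the equal-mutability assumption, which is one reason the abstract warns about the pitfalls of such analyses. A threshold of this kind is best read as a first-pass screen, not as evidence of a mechanism.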
