Context of deletions and insertions in human coding sequences

We studied the dependence of the rate of short deletions and insertions on their contexts using data on mutations within coding exons at 19 human loci that cause Mendelian diseases. We confirm that periodic sequences consisting of three to five or more nucleotides are mutagenic. Mutability of sequences with strongly biased nucleotide composition is also elevated, even when mutations within homonucleotide runs longer than three nucleotides are ignored. In contrast, no elevated mutation rates have been detected for imperfect direct or inverted repeats. Among known candidate contexts, the indel context GTAAGT and regions with purine‐pyrimidine imbalance between the two DNA strands are mutagenic in our sample, whereas many others are not. Data on mutation hot spots suggest two novel contexts that increase the deletion rate. Comprehensive analysis of mutability of all possible contexts of lengths four, six, and eight indicates a substantially elevated deletion rate within YYYTG and similar sequences, which is one of the two contexts revealed by the hot spots. Possible contexts that increase the insertion rate (AT(A/C)(A/C)GCC and TACCRC) and decrease the deletion (TATCGC) or insertion (GCGG) rate have also been identified. Two‐thirds of deletions remove a repeat, and over 80% of insertions create a repeat, i.e., they are duplications. Hum Mutat 23:177–185, 2004. Published 2003 Wiley‐Liss, Inc.
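The statement that deletions remove a repeat and insertions create one can be made concrete with a small sketch. The Python below uses one simple operational definition, which is an assumption made here for illustration and not necessarily the criterion used in the study: an indel of length n counts as repeat-related if two adjacent, identical n-mers overlap the affected site. The function names and coordinate conventions (0-based positions) are hypothetical.

```python
def has_tandem_copy(seq: str, start: int, n: int) -> bool:
    """True if two adjacent identical n-mers cover seq[start:start+n].

    Every window start s from start-n to start is tried, so the result
    does not depend on how an ambiguous indel happens to be aligned.
    """
    for s in range(max(0, start - n), start + 1):
        if s + 2 * n <= len(seq) and seq[s:s + n] == seq[s + n:s + 2 * n]:
            return True
    return False


def insertion_is_duplication(seq: str, pos: int, ins: str) -> bool:
    """Insert `ins` before seq[pos]; True if the result holds a tandem copy."""
    mutated = seq[:pos] + ins + seq[pos:]
    return has_tandem_copy(mutated, pos, len(ins))


def deletion_removes_repeat(seq: str, start: int, n: int) -> bool:
    """Delete seq[start:start+n]; True if that segment was a tandem copy."""
    return has_tandem_copy(seq, start, n)
```

For example, inserting GC after the first two bases of ACGT yields ACGCGT, which contains the tandem pair CG CG, so the insertion is classified as a duplication even though GC itself does not flank the insertion point. Imperfect repeats and longer-range repeats are deliberately outside this sketch.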
