Prevalence of quadruplexes in the human genome

Guanine-rich DNA sequences of a particular form have the ability to fold into four-stranded structures called G-quadruplexes. In this paper, we present a working rule to predict which primary sequences can form this structure, and describe a search algorithm to identify such sequences in genomic DNA. We count the number of quadruplexes found in the human genome and compare that with the figure predicted by modelling DNA as a Bernoulli stream or as a Markov chain, using windows of various sizes. We demonstrate that the distribution of loop lengths is significantly different from what would be expected in a random case, providing an indication of the number of potentially relevant quadruplex-forming sequences. In particular, we show that there is a significant repression of quadruplexes in the coding strand of exonic regions, which suggests that quadruplex-forming patterns are disfavoured in sequences that will form RNA.
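
The search step described above can be illustrated with a minimal sketch. The abstract does not state the folding rule itself, so the motif used here is an assumption: four runs of at least three guanines separated by three loops of 1 to 7 nucleotides, a form commonly used for putative quadruplex sequences. The names `PQS_PATTERN`, `reverse_complement` and `find_pqs` are hypothetical and are not taken from the paper.

```python
import re

# Assumed motif (not given in the abstract): G{3+} N{1-7} G{3+} N{1-7} G{3+} N{1-7} G{3+}
PQS_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_pqs(seq):
    """Return sorted (start, end, strand) tuples for non-overlapping matches.

    Coordinates are 0-based, end-exclusive, and always refer to the
    forward strand, so a '-' hit marks a C-rich stretch on the input.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), m.end(), "+") for m in PQS_PATTERN.finditer(seq)]
    # Scan the opposite strand and map its coordinates back to the input.
    for m in PQS_PATTERN.finditer(reverse_complement(seq)):
        hits.append((n - m.end(), n - m.start(), "-"))
    return sorted(hits)


if __name__ == "__main__":
    print(find_pqs("TTGGGAGGGTGGGAGGGTT"))
```

Because `finditer` reports only non-overlapping matches, this sketch counts a long G-rich tract as a single hit; the counting convention used in the paper may differ.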
