An unusual 500,000-base-long oscillation of guanine and cytosine content in human chromosome 21

An oscillation in guanine and cytosine content (GC%) with a period of around 500 kb is observed in the DNA sequence of human chromosome 21. The oscillation is localized to the rightmost one-eighth of the chromosome, from 43.5 Mb to 46.5 Mb. Five cycles are observed in this region, with six GC-rich peaks and five GC-poor valleys. The GC-poor valleys coincide with regions of low CpG island density and, alternating between the two DNA strands, regions of low gene density. Consequently, the long-range oscillation of GC% results in spacing patterns in CpG island density and, to a lesser extent, gene density.
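As an illustration of how a periodic GC% signal of this kind can be detected, the following is a minimal sketch and not the authors' method: it computes GC% in non-overlapping windows and picks the strongest component of a plain discrete Fourier transform. The function names, the window size, and the scaled-down synthetic sequence (five cycles of a 5,000-base period instead of 500 kb) are all assumptions made for the example.

```python
import cmath
import math
import random


def gc_percent_windows(seq, window):
    """GC% in consecutive non-overlapping windows (a trailing partial window is dropped)."""
    values = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window].upper()
        gc = chunk.count("G") + chunk.count("C")
        values.append(100.0 * gc / window)
    return values


def dominant_period(values, step):
    """Period, in the units of `step`, of the strongest nonzero DFT component."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]  # remove the mean so frequency 0 is ignored
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n * step / best_k


def synthetic_sequence(length, period, seed=0):
    """Hypothetical test data: random DNA whose GC probability oscillates around 50%."""
    rng = random.Random(seed)
    bases = []
    for pos in range(length):
        p_gc = 0.5 + 0.1 * math.sin(2 * math.pi * pos / period)
        bases.append(rng.choice("GC") if rng.random() < p_gc else rng.choice("AT"))
    return "".join(bases)


if __name__ == "__main__":
    # Scaled-down analogue of the abstract: five cycles of one period.
    seq = synthetic_sequence(length=25_000, period=5_000)
    gc = gc_percent_windows(seq, window=100)
    print("estimated period (bases):", dominant_period(gc, step=100))
```

On real chromosome data the window would be much larger (tens of kilobases, say) and the analysis would be restricted to the 43.5 to 46.5 Mb region; a library FFT would replace the explicit DFT loop for long inputs.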
