Compositional heterogeneity within and among isochores in mammalian genomes. I. CsCl and sequence analyses.

GC level distributions of a species' nuclear genome, or of its compositional fractions, encode key information on the structural and functional properties of the genome and on its evolution. They can be calculated either from absorbance profiles of the DNA in CsCl density gradients at sedimentation equilibrium, or by scanning long contigs of largely sequenced genomes. In the present study, we address the quantitative characterization of the compositional heterogeneity of genomes, as measured by the GC distributions of fixed-length fragments. Special attention is given to mammalian genomes, since their compartmentalization into isochores implies two levels of heterogeneity: intra-isochore (local) and inter-isochore (global). This partitioning is a natural one, since large-scale compositional properties vary much more among isochores than within them. Intra-isochore GC distributions become roughly Gaussian for long fragments, and their standard deviations decrease only slowly with increasing fragment length, unlike those of random sequences. This effect can be explained by the often-overlooked 'long-range' correlations that are present along isochores.
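The fragment-based measurement described above can be illustrated with a minimal sketch. It assumes non-overlapping windows and a plain DNA string as input; the function names (gc_levels, std_dev, random_expectation) are hypothetical and are not taken from the study. For an uncorrelated random sequence with GC fraction p, the standard deviation of fragment GC levels is expected to follow the binomial value sqrt(p(1 - p)/l) for fragment length l, which serves here as the baseline against which a slower decrease would be judged.

```python
import math
import random


def gc_levels(seq, window):
    """Return the GC fraction of each non-overlapping window of length `window`.

    A trailing partial window is dropped (an assumption of this sketch).
    """
    seq = seq.upper()
    levels = []
    for start in range(0, len(seq) - window + 1, window):
        frag = seq[start:start + window]
        levels.append((frag.count("G") + frag.count("C")) / window)
    return levels


def std_dev(values):
    """Population standard deviation of a non-empty list of numbers."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def random_expectation(gc, window):
    """Binomial standard deviation expected for an uncorrelated sequence."""
    return math.sqrt(gc * (1.0 - gc) / window)


if __name__ == "__main__":
    # Simulated uncorrelated sequence with equal base frequencies (GC = 0.5).
    rng = random.Random(0)
    seq = "".join(rng.choice("ACGT") for _ in range(200_000))
    for window in (100, 400, 1600):
        observed = std_dev(gc_levels(seq, window))
        expected = random_expectation(0.5, window)
        print(window, round(observed, 4), round(expected, 4))
```

On the simulated sequence the observed and expected values should track each other closely. Applying the same functions to a real intra-isochore sequence would be one way to see the slower decrease that the abstract attributes to long-range correlations.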
