Similarity and dissimilarity in correlations of genomic DNA

We analyze the autocorrelations of human chromosomes 1–22 and rice chromosomes 1–12 under seven binary mapping rules. The correlation patterns differ from rule to rule but are almost identical across all chromosomes, despite their varying lengths and GC contents. We propose a simple stochastic process for modeling these correlations and find that it reproduces, both quantitatively and qualitatively, the correlation patterns found in the human and rice genomes.
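
To make the measured quantity concrete, the following is a minimal sketch of how an autocorrelation function could be computed for a DNA sequence under a binary mapping rule. The abstract does not list the seven rules, so the `RULES` table below is an assumption (four single-nucleotide rules plus the three two-nucleotide groupings named by the IUPAC codes S/W, R/Y, and K/M). The function names `binary_map` and `autocorrelation`, and the choice to normalize by the sample variance, are likewise illustrative and not taken from the paper.

```python
# Assumed rule set: each rule names the nucleotides mapped to 1; all others map to 0.
# This table is hypothetical; the abstract does not specify the seven rules.
RULES = {
    "A": "A",
    "C": "C",
    "G": "G",
    "T": "T",
    "SW": "GC",  # strong (G, C) = 1, weak (A, T) = 0
    "RY": "AG",  # purine (A, G) = 1, pyrimidine (C, T) = 0
    "KM": "GT",  # keto (G, T) = 1, amino (A, C) = 0
}


def binary_map(seq, rule):
    """Map a nucleotide string to a list of 0/1 values under the given rule."""
    ones = RULES[rule]
    return [1 if base in ones else 0 for base in seq.upper()]


def autocorrelation(x, max_lag):
    """Return normalized autocorrelation values for lags 1..max_lag.

    Each value is the covariance between x[i] and x[i + lag], averaged over
    all valid positions i, divided by the variance of x.
    """
    n = len(x)
    if not 0 < max_lag < n:
        raise ValueError("max_lag must satisfy 0 < max_lag < len(x)")
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    if var == 0:
        raise ValueError("autocorrelation is undefined for a constant sequence")
    result = []
    for lag in range(1, max_lag + 1):
        cov = sum(
            (x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)
        ) / (n - lag)
        result.append(cov / var)
    return result


if __name__ == "__main__":
    # Toy sequence for illustration only; real chromosomes are far longer.
    toy = "ACGTGGCCATATGCGCATTA"
    for rule in RULES:
        print(rule, autocorrelation(binary_map(toy, rule), max_lag=3))
```

This direct sum costs time proportional to sequence length times the number of lags, which is fine for a demonstration; for chromosome-scale sequences an FFT-based estimate would be the more practical choice.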
