A representation of DNA primary sequences by random walk.

We describe DNA primary sequences by a random walk. This description yields two random sequences, {Y(m)} and {X(n)}, corresponding to a DNA sequence, together with graphical representations of DNA sequences. We further prove that {Y(m)} and {X(n)} are both Markov chains. Based on the transition probability distributions of these Markov chains, which serve as numerical characterizations of the random sequences, we introduce new invariants for DNA primary sequences. Using these invariants, we compare the primary sequences of exon 1 of the beta-globin genes of nine species to analyze their similarity and dissimilarity.
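
The abstract does not define the walk or the sequences {Y(m)} and {X(n)}, so the following is only a minimal sketch of the general idea, not the paper's construction. It assumes a hypothetical assignment of the four bases to unit steps in the plane (the `STEPS` table) and, as a stand-in for the paper's transition probability distributions, estimates first-order transition frequencies between consecutive bases. The function names `walk` and `transition_matrix` are illustrative.

```python
from collections import Counter

# Hypothetical base-to-step assignment, chosen for illustration only;
# the abstract does not state which assignment the paper uses.
STEPS = {"A": (-1, 0), "G": (1, 0), "C": (0, 1), "T": (0, -1)}
BASES = "ACGT"


def walk(seq):
    """Return the list of lattice points visited by the walk for seq."""
    x, y = 0, 0
    path = [(x, y)]
    for base in seq.upper():
        dx, dy = STEPS[base]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


def transition_matrix(seq):
    """Estimate first-order transition frequencies between consecutive bases.

    Rows with no observed transitions are left as all zeros.
    """
    seq = seq.upper()
    counts = {a: Counter() for a in BASES}
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1
    matrix = {}
    for a in BASES:
        total = sum(counts[a].values())
        matrix[a] = {b: (counts[a][b] / total if total else 0.0) for b in BASES}
    return matrix


if __name__ == "__main__":
    example = "ATGGTGCACCTG"  # made-up short sequence, not real gene data
    print(walk(example)[-1])
    print(transition_matrix(example)["G"])
```

Under these assumptions, two sequences could be compared through a distance between their estimated transition matrices; the invariants actually used in the paper may differ.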
