Three 3D graphical representations of DNA primary sequences based on the classifications of DNA bases and their applications

Abstract: In this article, we introduce three 3D graphical representations of DNA primary sequences, which we call the RY-curve, MK-curve and SW-curve, based on three classifications of the DNA bases. These representations have two advantages: (i) the 3D curves are strictly non-degenerate, so no information is lost when a DNA sequence is converted to its mathematical representation, and (ii) the coordinates of every node on the curves have a clear biological interpretation. Two applications of the curves are presented: (a) a simple formula is derived to calculate the content of the four bases (A, G, C and T) from the coordinates of nodes on the curves; and (b) a 12-component characteristic vector, built from the geometric centers of the 3D curves, is constructed to compare DNA sequences from different species. As examples, we examine the similarity among the coding sequences of the first exon of the beta-globin gene from eleven species and validate the similarity of the cDNA sequences of the beta-globin gene from eight species.
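The abstract does not give the coordinate definitions of the RY-, MK- and SW-curves, so the following is only a minimal sketch of the general idea, not the construction used in the article. It assumes a simple Z-curve-style cumulative walk over the three standard base classifications (purine/pyrimidine, amino/keto, weak/strong hydrogen bonding) and shows how base counts can be recovered from a node's coordinates and how a geometric center can be computed. The function names `class_walks`, `base_counts` and `geometric_center` are hypothetical.

```python
from itertools import accumulate

# Three standard two-class groupings of the DNA bases.
RY = {"A": 1, "G": 1, "C": -1, "T": -1}  # purine (R) vs pyrimidine (Y)
MK = {"A": 1, "C": 1, "G": -1, "T": -1}  # amino (M) vs keto (K)
WS = {"A": 1, "T": 1, "C": -1, "G": -1}  # weak (W) vs strong (S) H-bonding


def class_walks(seq):
    """Return one (x, y, z) node per base: cumulative +1/-1 steps
    for the RY, MK and WS groupings (an assumed, Z-curve-like walk)."""
    seq = seq.upper()
    xs = accumulate(RY[b] for b in seq)
    ys = accumulate(MK[b] for b in seq)
    zs = accumulate(WS[b] for b in seq)
    return list(zip(xs, ys, zs))


def base_counts(node, n):
    """Recover the counts of A, C, G, T in the first n bases
    from the coordinates of the n-th node."""
    x, y, z = node
    return {
        "A": (n + x + y + z) // 4,
        "C": (n - x + y - z) // 4,
        "G": (n + x - y - z) // 4,
        "T": (n - x - y + z) // 4,
    }


def geometric_center(nodes):
    """Mean of the node coordinates along each axis."""
    n = len(nodes)
    return tuple(sum(axis) / n for axis in zip(*nodes))


seq = "ATGGTGCAC"  # short illustrative sequence, not data from the article
nodes = class_walks(seq)
counts = base_counts(nodes[-1], len(seq))
center = geometric_center(nodes)
```

Because each coordinate is a signed difference between two base classes, the three coordinates together with the position `n` determine all four base counts exactly, which is the kind of relationship application (a) relies on. How the article assembles its 12-component vector from the curve centers is not stated in the abstract, so that step is left out here.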
