TN curve: A novel 3D graphical representation of DNA sequence based on trinucleotides and its applications

Abstract In this paper, a novel 3D graphical representation of DNA sequences based on trinucleotides is proposed. For the first time, this representation allows direct inspection of both the composition and the distribution of trinucleotides in a DNA sequence, and it avoids loss of information, so that more information can be read from the curve. Based on this model, six numerical descriptors of a DNA sequence are derived without complicated calculations, and their utility is illustrated by applications to similarity/dissimilarity analysis of coding sequences and to the discrimination of conserved genes. In addition, two simple methods for similarity/dissimilarity analysis of coding sequences among different species are developed, using vectors of 64 and six components, respectively; these can provide convenient sequence alignment tools for both computational scientists and molecular biologists.
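
The abstract does not define the 64-component vector, so the following is only a minimal sketch of one plausible reading: the vector is taken to be the relative frequencies of the 64 possible trinucleotides, and two sequences are compared by the Euclidean distance between their vectors. The function names, the `step` parameter, and the choice of Euclidean distance are assumptions made for illustration and are not taken from the paper.

```python
from itertools import product
from math import sqrt

BASES = "ACGT"
# All 64 trinucleotides in a fixed (lexicographic) order.
TRINUCLEOTIDES = ["".join(p) for p in product(BASES, repeat=3)]


def trinucleotide_frequencies(seq, step=1):
    """Return a 64-component vector of trinucleotide relative frequencies.

    step=1 counts overlapping triplets; step=3 counts non-overlapping
    triplets starting at position 0 (an assumed reading frame).
    Triplets containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    total = 0
    for i in range(0, len(seq) - 2, step):
        tri = seq[i:i + 3]
        if tri in counts:
            counts[tri] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TRINUCLEOTIDES)
    return [counts[t] / total for t in TRINUCLEOTIDES]


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Under these assumptions, a smaller distance between two coding sequences indicates more similar trinucleotide composition. Note that a frequency vector of this kind captures composition only, not the distribution along the sequence that the graphical representation is said to preserve.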
