Three‐unit semicircles curve: A compact 3D graphical representation of DNA sequences based on classifications of nucleotides

A new three-dimensional graphical representation of DNA sequences, the three-unit semicircles (TUS) curve, is proposed based on three biclassifications of nucleotides. It maps a given sequence into a sequence of dots embedded in three unit semicircles. The TUS-curve is compact and avoids both degeneracy and loss of information. The geometrical center of the curve, which reflects the distribution of base frequencies in the corresponding DNA sequence, is extracted and used to analyze the similarity of various species. A phylogenetic tree of 11 species, built from the first exons of their β-globin genes, shows that the TUS-curve is a useful tool for extracting biological information. © 2011 Wiley Periodicals, Inc. Int J Quantum Chem, 2011
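The abstract does not give the TUS-curve mapping itself, so the following Python sketch only illustrates the two ingredients it names: the three biclassifications of nucleotides and a geometrical center. The classes used are the standard ones (purine/pyrimidine, amino/keto, weak/strong hydrogen bonding). The function names, the 0/1 encoding, and the use of a plain coordinate mean as the "center" are assumptions made for illustration, not the authors' construction.

```python
# Standard biclassifications of the four nucleotides.
# The first group of each pair is encoded as 1, the second as 0 (an assumed convention).
BICLASSES = {
    "purine/pyrimidine": ("AG", "CT"),
    "amino/keto": ("AC", "GT"),
    "weak/strong": ("AT", "CG"),
}


def characteristic_sequences(seq):
    """Return one binary sequence per biclassification.

    Symbols outside A, C, G, T (for example N) fall into the 0 class here,
    which is a simplification of this sketch.
    """
    seq = seq.upper()
    return {
        name: [1 if base in first else 0 for base in seq]
        for name, (first, _second) in BICLASSES.items()
    }


def class_fractions(seq):
    """Fraction of bases in the first group of each biclassification.

    Assumes a non-empty sequence.
    """
    n = len(seq)
    return {name: sum(bits) / n
            for name, bits in characteristic_sequences(seq).items()}


def geometric_center(points):
    """Arithmetic mean of a non-empty list of 3D points (x, y, z).

    Hypothetical helper: it would apply to the dots of any 3D curve,
    once a concrete mapping from bases to coordinates is chosen.
    """
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))
```

For instance, `class_fractions("AAGT")` gives the purine, amino, and weak fractions of the sequence. Summary statistics of this kind are what a center driven by base frequencies would be expected to encode.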
