Advanced Numerical Representation of DNA Sequences

Swarna bai Arniker (1), Hon Keung Kwan (2)

(1) Directorate of Laser Systems, Research Centre Imarat, Hyderabad 500 069, Andhra Pradesh state, India
(2) Department of Electrical and Computer Engineering, University of Windsor, 401 Sunset Avenue, Ontario

Abstract. DNA sequence analysis using digital signal processing methods requires mapping the base sequence to a numerical sequence. The choice of numerical representation determines how well the biological properties of a DNA sequence are reflected in the numerical domain, and therefore how well the characteristics of special regions of interest can be detected and identified. This paper presents several advanced methods of numerical representation for DNA sequence analysis, compares their relative merits and demerits, and offers some concluding remarks.
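As an illustration of what mapping a base sequence to a numerical sequence can look like, the following minimal sketch uses binary indicator sequences (one 0/1 sequence per base, often called the Voss representation) and evaluates the spectral power at frequency N/3, a measure commonly used for period-3 analysis. It is only one of many possible representations and is not necessarily among those the paper recommends; the function names `voss_indicators`, `power_at_frequency`, and `period3_power` are hypothetical and chosen for this example.

```python
import cmath

BASES = "ACGT"


def voss_indicators(seq):
    """Map a DNA string to four binary indicator sequences, one per base."""
    seq = seq.upper()
    return {b: [1 if s == b else 0 for s in seq] for b in BASES}


def power_at_frequency(x, k):
    """Squared magnitude of the k-th DFT coefficient of the sequence x."""
    n = len(x)
    coeff = sum(v * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, v in enumerate(x))
    return abs(coeff) ** 2


def period3_power(seq):
    """Spectral power at frequency N/3, summed over the four indicators.

    Assumption for this sketch: the length N is a positive multiple of 3,
    so that N/3 is an exact DFT bin.
    """
    n = len(seq)
    if n == 0 or n % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    indicators = voss_indicators(seq)
    return sum(power_at_frequency(indicators[b], n // 3) for b in BASES)


if __name__ == "__main__":
    print(voss_indicators("ACGT")["G"])
    print(period3_power("ATGATGATG"))
```

A sequence with a strong three-base repeat, such as "ATGATGATG", yields a large value from `period3_power`, whereas a homopolymer such as "AAAAAAAAA" yields a value near zero. Other representations would change the numerical signal and hence the spectrum, which is why the choice of mapping matters.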
