Performance analysis of different DNA to numerical mapping techniques for identification of protein coding regions using tapered window based short-time discrete Fourier transform

Before digital signal processing techniques can be applied to identify protein coding regions, the DNA alphabet must be mapped into numerical sequences. In this paper, the performance of existing DNA to numerical mapping techniques is analyzed at the nucleotide level for the identification of protein coding regions, using the tapered window based short-time discrete Fourier transform (ST-DFT) method applied to the GENSCAN test set. The results show better identification accuracy for the Voss, z-curve and tetrahedron mapping techniques than for the other mapping techniques when tapered windows are used with the ST-DFT method.
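As a rough illustration of the pipeline described above, the following sketch maps a DNA string with the Voss (binary indicator) representation and evaluates a tapered-window ST-DFT at the period-3 frequency, the spectral feature commonly associated with protein coding regions. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the choice of a Hamming window as the tapered window, and the default window length of 351 are all illustrative and are not taken from the text.

```python
import cmath
import math


def voss_map(seq):
    """Voss mapping: one binary indicator sequence per nucleotide."""
    seq = seq.upper()
    return {b: [1.0 if s == b else 0.0 for s in seq] for b in "ACGT"}


def hamming(n):
    """Hamming window, used here as one example of a tapered window."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def period3_spectrum(seq, win_len=351, window=hamming):
    """Slide a tapered window along seq and return, for each window
    position, the DFT power at frequency 1/3 summed over the four
    Voss indicator sequences.

    win_len=351 is an assumed default, not a value from the paper.
    """
    indicators = voss_map(seq)
    w = window(win_len)
    # Complex exponentials e^{-j 2 pi n / 3} for the period-3 DFT bin.
    tw = [cmath.exp(-2j * math.pi * n / 3) for n in range(win_len)]
    half = win_len // 2
    out = []
    for centre in range(half, len(seq) - half):
        start = centre - half
        total = 0.0
        for x in indicators.values():
            coeff = sum(w[n] * tw[n] * x[start + n] for n in range(win_len))
            total += abs(coeff) ** 2
        out.append(total)
    return out


if __name__ == "__main__":
    # Toy input: a strongly period-3 stretch versus a homopolymer stretch.
    periodic = period3_spectrum("ATG" * 40, win_len=33)
    flat = period3_spectrum("A" * 120, win_len=33)
    print(min(periodic) > max(flat))
```

Other mappings (for example z-curve or tetrahedron) could be substituted for `voss_map` as long as they return one or more numerical sequences of the same length as the input, and passing a different `window` function allows rectangular and tapered windows to be compared on the same sequence.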
