Polyphase filtering with variable mapping rule in protein coding region prediction

Genomic research is concerned with the study of the genomes of organisms. A central challenge for researchers is to identify the segments within a DNA sequence that are involved in protein synthesis, known as the coding regions of a gene. The methods generally used to identify these segments rely on the period-3 property of genes, which can be detected with great accuracy by digital signal processing (DSP). Before DSP can be applied to gene prediction, a conversion rule is required to convert the symbolic DNA sequence (ATCGTC…) into a numerical representation. The accuracy of gene prediction depends on this mapping rule, and the effectiveness of a mapping rule depends on the application area within genomics: a mapping rule that works well in gene prediction may not perform well in genetic disease prediction. Most of the available conversion rules are fixed mapping techniques. In this paper, a new conversion rule is proposed for use prior to DSP, and a polyphase filter is used to suppress the noise in the DNA spectrum. The performance of the proposed mapping is compared with that of existing mappings, and the performance of the polyphase filtering method is compared with that of existing filtering methods, in terms of signal-to-noise ratio (SNR) and location accuracy.
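The general pipeline described above (numerical mapping followed by detection of the period-3 component) can be illustrated with a minimal sketch. It assumes the well-known fixed binary-indicator (Voss-style) mapping, not the variable mapping rule proposed in the paper, and it measures spectral power at frequency 1/3 directly instead of using a polyphase filter. The function names `indicator_sequences`, `period3_power`, and `sliding_period3`, and the default window length and step, are hypothetical choices made only for this illustration.

```python
import cmath

BASES = "ACGT"

def indicator_sequences(dna):
    """Assumed fixed mapping: one 0/1 indicator sequence per nucleotide."""
    dna = dna.upper()
    return {b: [1 if ch == b else 0 for ch in dna] for b in BASES}

def period3_power(dna):
    """Sum over the four indicator sequences of |DFT at frequency 1/3|^2."""
    w = cmath.exp(-2j * cmath.pi / 3)  # e^{-j*2*pi/3}; w**(n % 3) == w**n
    total = 0.0
    for seq in indicator_sequences(dna).values():
        coeff = sum(x * w ** (n % 3) for n, x in enumerate(seq))
        total += abs(coeff) ** 2
    return total

def sliding_period3(dna, window=351, step=3):
    """Period-3 power in successive windows along the sequence."""
    last_start = len(dna) - window
    return [period3_power(dna[i:i + window])
            for i in range(0, last_start + 1, step)]
```

A sequence with a strong codon-like repeat, such as `"ATG" * 10`, gives a large value from `period3_power`, whereas a homopolymer such as `"A" * 30` gives a value near zero. Peaks in the output of `sliding_period3` would then mark candidate coding regions, under the assumptions stated above.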
