Improved time-domain approaches for locating exons in DNA using zero-phase filtering

Accurate prediction of exon locations in deoxyribonucleic acid (DNA) sequences is an important problem for geneticists. The time-domain periodogram (TDP) and the average magnitude difference function (AMDF) are two time-domain approaches previously proposed for this purpose. Both employ a second-order infinite impulse response (IIR) resonant filter as a preprocessing stage to emphasize the period-3 behavior exhibited by the exonic segments of DNA strands. The major drawback of IIR filters is their non-linear phase response, which delays the spectral components of the genomic signal by different amounts at the filter output. This delay distortion degrades the exon prediction accuracy of the TDP/AMDF classifier. This paper proposes using a zero-phase filtering technique in the preprocessing stage to eliminate the phase distortion introduced by traditional filtering. MATLAB simulations conducted on the ASP67 genomic dataset show that the proposed modified time-domain approaches using zero-phase filtering achieve better performance than the traditional approaches in terms of the receiver operating characteristic (ROC) curve, the precision-recall curve, and the F-measure.
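
The zero-phase idea described above can be illustrated with a minimal Python sketch. It is not the paper's MATLAB implementation: the resonator form, the pole radius `r`, and the function names `resonator` and `zero_phase` are assumptions chosen for illustration. The sketch filters the signal forward, reverses the result, filters again, and reverses back, so the phase shifts of the two passes cancel while the magnitude response is applied twice.

```python
import math


def resonator(x, r=0.99, w0=2 * math.pi / 3):
    """Causal second-order IIR resonator centered at w0 (period 3 by default).

    Assumed transfer function (illustrative, not taken from the paper):
        H(z) = (1 - z^-2) / (1 - 2 r cos(w0) z^-1 + r^2 z^-2)
    """
    a1 = -2.0 * r * math.cos(w0)  # feedback coefficient on y[n-1]
    a2 = r * r                    # feedback coefficient on y[n-2]
    y = []
    for n in range(len(x)):
        x2 = x[n - 2] if n >= 2 else 0.0
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y.append(x[n] - x2 - a1 * y1 - a2 * y2)
    return y


def zero_phase(x, **kw):
    """Forward-backward filtering: the two passes cancel each other's phase."""
    forward = resonator(x, **kw)
    backward = resonator(forward[::-1], **kw)
    return backward[::-1]


if __name__ == "__main__":
    # An impulse in the middle of a zero sequence makes the effect visible.
    x = [0.0] * 401
    x[200] = 1.0
    causal = resonator(x, r=0.9)
    symmetric = zero_phase(x, r=0.9)
    # The causal output is one-sided; the zero-phase output is symmetric about n = 200.
    print(causal[198], causal[202])
    print(symmetric[198], symmetric[202])
```

Because the zero-phase response is symmetric about the input sample, features in the filtered signal stay aligned with their positions in the sequence. The trade-off is that the method needs the whole sequence in advance, which is acceptable for stored genomic data.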
