An efficient algorithm to detect palindromes in DNA sequences using periodicity transform

This paper presents an algorithm that detects exact and inexact palindromes of a given size in DNA sequences by applying signal processing techniques. The algorithm uses a modified periodicity transform to calculate a compensated periodogram coefficient, which acts as a correlation parameter for the detection of inexact palindromes. Detailed experiments on a number of synthetic and actual DNA sequences demonstrate the efficiency and accuracy of our method for detecting inexact palindromes. A comparison with the standard technique shows that execution time grows significantly more slowly with sample size for our method than for the standard technique.
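To make the detection task concrete, the following is a minimal sketch of a direct sliding-window search, not the periodicity-transform method described above. It assumes the biological definition of a DNA palindrome (a window that equals its own reverse complement) and treats a window as an inexact palindrome when the number of mismatched base pairs stays within a threshold. The names `palindrome_mismatches`, `find_palindromes`, and `max_mismatches` are hypothetical and chosen only for illustration.

```python
# Illustrative baseline only: a brute-force window scan, NOT the
# periodicity-transform algorithm of the paper.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def palindrome_mismatches(window):
    """Count base pairs (i, n-1-i) that are not Watson-Crick complements."""
    n = len(window)
    return sum(
        1
        for i in range(n // 2)
        if COMPLEMENT[window[i]] != window[n - 1 - i]
    )


def find_palindromes(seq, size, max_mismatches=0):
    """Return (start, mismatches) for each window of the given even size
    whose mismatch count is at most max_mismatches."""
    if size % 2 != 0:
        # Assumption: only even sizes are considered, since the middle base
        # of an odd-length window cannot pair with itself.
        raise ValueError("size must be even")
    hits = []
    for start in range(len(seq) - size + 1):
        mismatches = palindrome_mismatches(seq[start:start + size])
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


print(find_palindromes("AAGAATTCAA", 6))  # [(2, 0)]: GAATTC is an exact palindrome
```

The cost of this scan is proportional to the sequence length times the palindrome size, which is the kind of growth a transform-based method would aim to improve on.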
