The Minimum Entropy Mapping Spectrum of a DNA Sequence

Periodicities in a DNA sequence carry information about its function and structure, and frequency analysis can reveal them. This paper proposes a definition of the spectrum of a DNA sequence based on an entropy minimization criterion. The proposed spectrum is shown to have the properties expected of the spectrum of a symbolic sequence. The method is applied to simulated and experimental DNA sequences and compared against classical techniques to demonstrate its effectiveness.
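
For orientation, the kind of classical frequency analysis the abstract refers to can be sketched as follows. This is not the minimum entropy mapping spectrum, whose construction is not given here. It is the widely used indicator-sequence approach, in which each base is mapped to a binary sequence and the squared DFT magnitudes are summed. The function names `indicator_power_spectrum` and `dominant_period` are hypothetical, chosen only for this illustration.

```python
import cmath

def indicator_power_spectrum(seq):
    """Sum over A, C, G, T of the squared DFT magnitude of each
    binary indicator sequence (classical baseline, not the proposed method)."""
    n = len(seq)
    spectrum = [0.0] * n
    for base in "ACGT":
        # Binary indicator: 1 where the base occurs, 0 elsewhere.
        u = [1.0 if s == base else 0.0 for s in seq]
        for k in range(n):
            x = sum(u[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))
            spectrum[k] += abs(x) ** 2
    return spectrum

def dominant_period(seq):
    """Period (in bases) of the strongest non-DC spectral component."""
    n = len(seq)
    s = indicator_power_spectrum(seq)
    # Skip k = 0 (the DC term) and search up to the Nyquist index.
    k = max(range(1, n // 2 + 1), key=lambda i: s[i])
    return n / k

if __name__ == "__main__":
    # A perfectly 3-periodic toy sequence, used only as a sanity check.
    print(dominant_period("ATG" * 20))
```

The direct O(N^2) DFT keeps the sketch dependency-free; for real sequences an FFT routine would be the natural replacement.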
