Study of DNA binding sites using the Rényi parametric entropy measure.

Shannon's definition of uncertainty, or surprisal, has been applied extensively to measure the information content of aligned DNA sequences and to characterize DNA binding sites. In contrast to work based on Shannon's uncertainty, this study investigates the applicability and suitability of a parametric uncertainty measure due to Rényi. This measure is observed to give results in agreement with Shannon's measure, which points to its utility in analysing DNA binding site regions. To facilitate comparison between the two uncertainty measures, a dimensionless quantity called "redundancy" is employed. Rényi's measure at low parameter values is found to delineate binding sites within binding regions better than Shannon's measure. The critical value of the parameter is chosen using an outlier criterion.
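
As a rough illustration of the quantities named above, the following sketch computes Shannon and Rényi uncertainty for one column of an alignment and converts each to a redundancy. It uses the standard Rényi form H_alpha = log2(sum_i p_i^alpha) / (1 - alpha), which tends to Shannon's measure as alpha approaches 1. The redundancy formula R = 1 - H / log2(4) is an assumption made here for illustration; the study's exact definition, its small-sample handling, and its outlier criterion are not given in this abstract, and all function names are hypothetical.

```python
import math
from collections import Counter

BASES = "ACGT"

def column_frequencies(sequences, position):
    """Relative base frequencies at one position of equal-length aligned sequences."""
    counts = Counter(seq[position] for seq in sequences)
    total = sum(counts[b] for b in BASES)  # symbols outside ACGT are ignored
    return [counts[b] / total for b in BASES]

def shannon_entropy(probs):
    """Shannon uncertainty in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def renyi_entropy(probs, alpha):
    """Renyi uncertainty of order alpha (alpha > 0) in bits."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if math.isclose(alpha, 1.0):
        # The Renyi measure reduces to Shannon's in the limit alpha -> 1.
        return shannon_entropy(probs)
    power_sum = sum(p ** alpha for p in probs if p > 0)
    return math.log2(power_sum) / (1.0 - alpha)

def redundancy(entropy, n_symbols=4):
    """Assumed definition: 1 - H / H_max, with H_max = log2(n_symbols)."""
    return 1.0 - entropy / math.log2(n_symbols)

if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    aligned = ["ACGT", "ACGA", "ACTT", "AGGT"]
    for pos in range(len(aligned[0])):
        probs = column_frequencies(aligned, pos)
        r_shannon = redundancy(shannon_entropy(probs))
        r_renyi = redundancy(renyi_entropy(probs, alpha=0.5))
        print(pos, round(r_shannon, 3), round(r_renyi, 3))
```

Under this assumed definition, redundancy is 0 for a column with equiprobable bases and 1 for a fully conserved column, so both measures share a common dimensionless scale on which they can be compared position by position.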
