Detection of transcription factor binding sites using Rényi entropy

During protein synthesis, transcription of DNA into messenger RNA starts with the binding of transcription factors to the promoter. One difficulty in predicting transcription factor binding is that the sequences at binding sites vary. In this manuscript we propose a method for detecting binding sites based on a parametric uncertainty measure, the Rényi entropy. The measure is computed from an estimate of the probability of each nucleotide, which avoids assigning any numerical representation to the nucleotides. We report the performance of the method as receiver operating characteristic curves obtained for ABF1 and ROX1 binding sites on chromosomes I and XVI of Saccharomyces cerevisiae.
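As an illustration of the kind of measurement described here, the following minimal sketch computes the Rényi entropy of order alpha, H_alpha = log2(sum_i p_i^alpha) / (1 - alpha), from nucleotide frequencies estimated column by column over a set of aligned sequences. It is not the authors' implementation: the function names, the optional pseudocount, the column-wise frequency estimate, and the toy sequences are all assumptions made for the example, and the abstract does not specify how the probabilities are estimated or which values of alpha are used.

```python
import math
from collections import Counter

NUCLEOTIDES = "ACGT"


def nucleotide_probabilities(column, pseudocount=0.0):
    """Estimate P(A), P(C), P(G), P(T) from the symbols in one alignment column.

    The nucleotides are only counted, never mapped to numbers.
    The pseudocount is an assumption of this sketch, not part of the source.
    """
    counts = Counter(column)
    total = sum(counts[n] for n in NUCLEOTIDES) + pseudocount * len(NUCLEOTIDES)
    if total == 0:
        raise ValueError("column contains no A, C, G or T")
    return [(counts[n] + pseudocount) / total for n in NUCLEOTIDES]


def renyi_entropy(probs, alpha):
    """Rényi entropy of order alpha, in bits.

    For alpha == 1 the Shannon entropy (the limiting case) is returned.
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    nonzero = [p for p in probs if p > 0]
    if math.isclose(alpha, 1.0):
        return -sum(p * math.log2(p) for p in nonzero)
    return math.log2(sum(p ** alpha for p in nonzero)) / (1.0 - alpha)


def column_entropies(aligned_sites, alpha):
    """Rényi entropy of every column of equal-length aligned sequences."""
    return [
        renyi_entropy(nucleotide_probabilities(column), alpha)
        for column in zip(*aligned_sites)
    ]


# Toy aligned sequences, invented for illustration (not real ABF1 or ROX1 sites).
toy_sites = ["ACGT", "ACGA", "ACCT", "AGGT"]
print(column_entropies(toy_sites, alpha=2.0))
```

In this sketch a fully conserved column gives an entropy of 0 bits and a column with uniform nucleotide usage gives 2 bits, so low-entropy positions are the candidates for conserved binding-site positions. How such scores are thresholded to build the receiver operating characteristic curves is not stated in the abstract and is left out here.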
