Prediction of Transcription Factor Binding Sites Using Genetic Algorithm

Identifying transcription factor binding sites (TFBS) in the upstream regions of genes remains an important and unsolved problem, particularly in higher eukaryotic genomes. In this paper, we propose a new approach to predicting TFBS. The approach represents binding sites with a position weight matrix (PWM) and uses a genetic algorithm (GA) to search for the best matrix. We also propose a new coding method for the GA, called multiple-variable coding. We apply the approach to two transcription factors, Reb1 and Mig1. The results show that it finds most of the known sites, which suggests that the method is effective.
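
To make the PWM representation concrete, the following is a minimal sketch of how a log-odds matrix can be built from aligned sites and used to scan a sequence. It is not the method of this paper: the function names, the pseudocount, the uniform background frequency, and the example sites are all assumptions made for illustration, and the GA search and the multiple-variable coding are not shown because the abstract does not specify them.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length aligned sites.

    The pseudocount and uniform background are illustrative assumptions.
    """
    width = len(sites[0])
    pwm = []
    for pos in range(width):
        # Start every base at the pseudocount so no probability is zero.
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2(counts[base] / total / background) for base in BASES}
        )
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds scores for one candidate site."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (position, score) of the highest-scoring window."""
    width = len(pwm)
    positions = range(len(sequence) - width + 1)
    best = max(positions, key=lambda i: score_window(pwm, sequence[i:i + width]))
    return best, score_window(pwm, sequence[best:best + width])


# Hypothetical aligned sites and upstream sequence, for illustration only.
example_sites = ["TTACCCGG", "TTACCCGT", "CTACCCGG"]
example_pwm = build_pwm(example_sites)
position, score = best_hit(example_pwm, "AAAATTACCCGGAAAA")
```

In a GA setting, a score of this kind could serve as part of a fitness function for candidate matrices, though the fitness actually used in the paper is not stated in the abstract.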
