Graphical approach for motif recognition in DNA sequences

Several algorithms for motif recognition have been developed in the past few years. Each is superior to the others in some respect, yet none has been established as the best. Some of the well-recognized algorithms are based on heuristic methods, such as Gibbs sampling and expectation maximization, and others on enumeration methods, such as oligo-analysis. However, their inability to solve the "Challenge Problem" in motif recognition exposed the drawbacks of the existing heuristic and enumeration methods. Two newer algorithms were developed to address this problem, but they still suffer from high time and space costs and from the problem of local optima. We propose a new algorithm that solves the challenge problem with better performance, even in very long sequences, by applying dynamic programming to path searching in a graph and then scanning with the consensus sequence to eliminate spurious motif instances.
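The abstract does not specify how the graph is built or scored, so the following Python is only an illustrative sketch of the two named steps under assumed details, not the authors' implementation. It assumes a layered graph with one layer per sequence, one node per l-mer, and edges between consecutive layers weighted by Hamming distance; dynamic programming finds the cheapest path, and the consensus of that path is then used to rescan every sequence. The function names (`best_path`, `consensus`, `scan`) and the scoring scheme are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length words."""
    return sum(x != y for x, y in zip(a, b))


def lmers(seq, l):
    """All length-l windows of seq, in order of start position."""
    return [seq[i:i + l] for i in range(len(seq) - l + 1)]


def best_path(seqs, l):
    """Assumed layered-graph model: pick one l-mer per sequence so that the
    summed Hamming distance between consecutive picks is minimal.
    Returns the chosen start position in each sequence."""
    layers = [lmers(s, l) for s in seqs]
    cost = [0] * len(layers[0])   # cheapest path cost ending at each node
    back = []                     # back-pointers, one list per transition
    for prev, cur in zip(layers, layers[1:]):
        new_cost, ptr = [], []
        for w in cur:
            j = min(range(len(prev)),
                    key=lambda k: cost[k] + hamming(prev[k], w))
            new_cost.append(cost[j] + hamming(prev[j], w))
            ptr.append(j)
        cost = new_cost
        back.append(ptr)
    # Trace back from the cheapest node in the last layer.
    idx = min(range(len(cost)), key=cost.__getitem__)
    path = [idx]
    for ptr in reversed(back):
        idx = ptr[idx]
        path.append(idx)
    path.reverse()
    return path


def consensus(words):
    """Column-wise most frequent base of equal-length words."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*words))


def scan(seqs, cons):
    """For each sequence, the start of the window closest to the consensus.
    Rescanning replaces path nodes that match the consensus poorly."""
    l = len(cons)
    hits = []
    for s in seqs:
        ws = lmers(s, l)
        hits.append(min(range(len(ws)), key=lambda k: hamming(ws[k], cons)))
    return hits


# Toy input (made up for illustration): the same 8-mer planted in each sequence.
demo = ["TTTTACGTACGTTTTT", "GGACGTACGTGGGGGG", "CCCCCCACGTACGTCC"]
starts = best_path(demo, 8)
cons = consensus([s[i:i + 8] for s, i in zip(demo, starts)])
print(starts, cons, scan(demo, cons))
```

For n sequences of length m, this sketch costs roughly O(n · m² · l) time and O(n · m) space for the back-pointers. Chaining only consecutive sequences is a simplification chosen to keep the dynamic program exact and short.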
