Identifying Transcription Factor Binding Sites Based on a Neural Network

The identification of regulatory motifs (transcription factor binding sites) in DNA sequences is a difficult pattern recognition problem. Many methods have been developed in the past few years. Although some perform better than others in particular respects, no single method is recognized as the best. In particular, exhaustive enumeration becomes problematic when motifs are long and subtle. In this paper, we present a new method that improves exhaustive enumeration by means of a neural network. We test its performance on both synthetic data and realistic biological data. The method proved successful in identifying very subtle motifs, and experiments show that it outperforms some popular methods on this task. We refer to the new method as IMNN (Identifying Motifs based on a Neural Network).