Planted (l, d) - Motif Finding using Particle Swarm Optimization

In bioinformatics, motif finding is one of the most popular problems and has many applications. In general, the task is to locate recurring patterns in a sequence of nucleotides or amino acids. Because biological mutations mean the occurrences of a pattern cannot be expected to be exact copies, motif finding becomes an NP-complete problem. Many solutions in the literature approximate the problem in different ways, but most of these algorithms suffer from local optima. Particle swarm optimization (PSO) is a new global optimization technique with wide applications. It finds the global best solution by adjusting the trajectory of each individual towards its own best location and towards the best particle of the swarm at each generation. We have adapted the features of PSO to solve the Planted Motif Finding Problem and have designed a sequential algorithm, PMbPSO. In experiments with simulated data, it outperforms MbGA and PbGA. We also applied PMbPSO to real biological data sets and observed that the algorithm is able to detect known TFBS accurately when there are no mutations.

General Terms: Evolutionary Optimization Techniques, Bioinformatics, Computational Biology.
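To make the two ingredients named above concrete, the following is a minimal sketch, not the PMbPSO algorithm itself. It assumes a candidate motif of length l is scored by its total Hamming distance to the closest l-length window in each sequence, and it uses the canonical real-valued PSO update. The function names (`hamming`, `total_distance`, `pso_step`) and the parameter values `w`, `c1`, `c2` are illustrative assumptions, and how a particle encodes a candidate motif is left open.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def total_distance(motif, seqs):
    """Sum, over all sequences, of the minimum Hamming distance between
    the candidate motif and any window of the same length.
    Lower is better; in a planted (l, d) instance the planted motif
    is within distance d of some window in every sequence."""
    l = len(motif)
    return sum(
        min(hamming(motif, s[i:i + l]) for i in range(len(s) - l + 1))
        for s in seqs
    )


def pso_step(pos, vel, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=random):
    """One canonical PSO update for a single particle.
    The velocity is pulled towards the particle's own best position (pbest)
    and towards the swarm's best position (gbest).
    w, c1, c2 are assumed, commonly used values, not taken from the paper."""
    new_vel = [
        w * v + c1 * rng.random() * (pb - x) + c2 * rng.random() * (gb - x)
        for x, v, pb, gb in zip(pos, vel, pbest, gbest)
    ]
    new_pos = [x + v for x, v in zip(pos, new_vel)]
    return new_pos, new_vel
```

A discrete problem such as motif finding additionally needs a mapping between these real-valued positions and candidate motifs or start positions; that mapping is specific to each method and is not shown here.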
