Accelerating search tree based brute force motif discovery technique on a processor cluster

Determining the conserved regions that play vital roles in regulating transcription and translation is one of the most challenging problems in bioinformatics. With the increasing power of distributed computing systems, however, solving such combinatorial problems with parallelized brute-force or exhaustive search algorithms has recently gained popularity. In this paper, we investigated a parallelized implementation of a search-tree-based brute-force technique for finding motifs of different lengths. Experimental studies showed that parallelizing brute-force techniques with low communication overhead significantly increases their usability for analyzing long nucleotide sequences.
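The abstract does not spell out the algorithm, so the following is only a minimal sketch of one common way a search-tree-based brute-force motif search can be organized. It assumes the median-string formulation: among all candidate patterns of length k, find the one with the smallest total Hamming distance to its best-matching window in each sequence. The function names, the `split_depth` parameter, and the scoring function are illustrative assumptions, not the paper's implementation. The search tree is split by prefix into independent subtrees, and each subtree reports only its local best candidate, which is one way to keep communication overhead low.

```python
from itertools import product

ALPHABET = "ACGT"

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def total_distance(pattern, sequences):
    """Sum, over all sequences, of the smallest Hamming distance between
    the pattern and any window of the same length in that sequence."""
    k = len(pattern)
    return sum(
        min(hamming(pattern, s[i:i + k]) for i in range(len(s) - k + 1))
        for s in sequences
    )

def search_subtree(prefix, k, sequences):
    """Exhaustively explore the search-tree branch rooted at `prefix`.
    Returns (distance, pattern) for the best leaf in this branch."""
    best = (float("inf"), None)
    stack = [prefix]
    while stack:
        node = stack.pop()
        if len(node) == k:
            # Leaf: a complete candidate motif of length k.
            best = min(best, (total_distance(node, sequences), node))
        else:
            # Internal node: extend the prefix by one nucleotide.
            stack.extend(node + c for c in ALPHABET)
    return best

def find_motif(sequences, k, split_depth=1):
    """Split the tree into 4**split_depth independent subtrees.
    Assumption: on a cluster, each prefix would be handed to a separate
    worker; here the subtrees are simply processed one after another."""
    depth = min(split_depth, k)
    prefixes = ["".join(p) for p in product(ALPHABET, repeat=depth)]
    local_bests = [search_subtree(p, k, sequences) for p in prefixes]
    # Only one (distance, pattern) pair per subtree needs to be gathered.
    return min(local_bests)
```

Because the subtrees share no state, the loop over `prefixes` could be replaced by any task-distribution mechanism without changing the result; ties are broken by choosing the lexicographically smallest pattern.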
