Parallel Simulated Annealing for Fragment Based Sequence Alignment

Finding a biologically relevant sequence alignment can be difficult because many alignments are possible, depending on which parameters are taken into consideration. A perceptron neuron can be used to assign weights to a set of alignment characteristics and to decide whether two residues should be aligned. Finding a good set of weights can be a hard problem; simulated annealing can be used for this purpose, but it can take a long time. In this paper, we propose a parallelization strategy for simulated annealing that optimizes a Fragment Based Alignment in Linear Space (FBALS). The results were superior to those of the competing algorithm, and the obtained speedups were compatible with the number of processing cores, indicating a good parallel strategy.
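As a rough illustration of the idea, the sketch below tunes perceptron weights with simulated annealing and runs several independent annealing chains concurrently. It is a minimal sketch under stated assumptions, not the FBALS implementation or the parallelization strategy proposed in the paper. The feature vectors, the scoring function (number of correctly classified residue pairs), the cooling schedule, and the "independent chains with different seeds" scheme are all hypothetical choices made for the example.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def perceptron_align(weights, features, bias=0.0):
    """Return True if the weighted sum of features votes to align two residues."""
    return sum(w * x for w, x in zip(weights, features)) + bias > 0.0


def score(weights, examples):
    """Hypothetical objective: number of labelled residue pairs classified correctly."""
    return sum(perceptron_align(weights, f) == label for f, label in examples)


def anneal(examples, n_weights, seed, steps=2000, t0=1.0, cooling=0.995):
    """One simulated annealing chain over the weight vector (assumed schedule)."""
    rng = random.Random(seed)  # private generator, so chains do not interfere
    current = [rng.uniform(-1, 1) for _ in range(n_weights)]
    current_score = score(current, examples)
    best, best_score = list(current), current_score
    t = t0
    for _ in range(steps):
        # Propose a neighbour by perturbing every weight slightly.
        candidate = [w + rng.gauss(0, 0.1) for w in current]
        cand_score = score(candidate, examples)
        delta = cand_score - current_score
        # Accept improvements always, worse moves with probability exp(delta / t).
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = list(current), current_score
        t *= cooling  # geometric cooling
    return best_score, best


def parallel_anneal(examples, n_weights, seeds):
    """Run one chain per seed concurrently and keep the best result."""
    with ThreadPoolExecutor(max_workers=len(seeds)) as pool:
        results = list(pool.map(lambda s: anneal(examples, n_weights, s), seeds))
    return max(results, key=lambda r: r[0])
```

Threads are used here only to show the structure of the computation. In CPython, CPU-bound chains would need separate processes (or a compiled implementation) to obtain speedups that scale with the number of cores.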
