Paper Information - Multiple Sequence Alignment Using Tabu Search

Multiple Sequence Alignment Using Tabu Search

Tabu search is a meta-heuristic approach that has proven useful for solving combinatorial optimization problems. We apply the adaptive memory features of tabu search to align multiple sequences. Adaptive memory helps the search process avoid local optima and explore the solution space economically and effectively without getting trapped in cycles. The algorithm is further enhanced with extended tabu search features such as intensification and diversification. It intensifies by directing the search process to poorly aligned regions of an elite solution, and softly diversifies by moving from one poorly aligned region to another. The neighbors of a solution are generated stochastically, and a consistency-based objective function is employed to measure their quality. The algorithm is tested on datasets from the BAliBASE benchmarking database. Our experiments show that, for datasets comprising orphan sequences, divergent families, and long internal insertions, tabu search generates better alignments than the other methods studied in this paper. The source code of our tabu search algorithm is available at http://www.bii.a-star.edu.sg/~tariq/tabu/ .
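The core loop described in the abstract (stochastic neighbors, a short-term memory that blocks revisits, and an objective that ranks candidates) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`sp_score`, `pad`, `random_neighbor`, `tabu_search`) and all parameters are hypothetical, a simple sum-of-pairs score stands in for the paper's consistency-based objective, and the intensification and diversification steps are left out.

```python
import random
from collections import deque
from itertools import combinations


def sp_score(rows):
    """Sum-of-pairs score: +1 match, -1 mismatch or residue/gap, 0 gap/gap.
    Assumption: a stand-in for the paper's consistency-based objective."""
    total = 0
    for col in zip(*rows):
        for a, b in combinations(col, 2):
            if a == "-" and b == "-":
                continue
            total += 1 if a == b else -1
    return total


def pad(seqs, extra=2):
    """Build a starting alignment by right-padding every sequence with gaps."""
    width = max(len(s) for s in seqs) + extra
    return tuple(s + "-" * (width - len(s)) for s in seqs)


def random_neighbor(rows, rng):
    """Stochastic move: relocate one gap inside one randomly chosen row."""
    i = rng.randrange(len(rows))
    row = list(rows[i])
    gaps = [k for k, c in enumerate(row) if c == "-"]
    if not gaps:
        return rows
    row.pop(rng.choice(gaps))
    row.insert(rng.randrange(len(row) + 1), "-")
    return rows[:i] + ("".join(row),) + rows[i + 1:]


def tabu_search(seqs, iters=200, n_neighbors=20, tenure=15, seed=0):
    """Return (best_alignment, best_score) found within `iters` iterations."""
    rng = random.Random(seed)
    current = pad(seqs)
    best, best_score = current, sp_score(current)
    tabu = deque([current], maxlen=tenure)  # recently visited alignments
    for _ in range(iters):
        candidates = [random_neighbor(current, rng) for _ in range(n_neighbors)]
        allowed = [c for c in candidates if c not in tabu]
        if not allowed:
            continue
        # Take the best non-tabu neighbor even if it is worse than `current`;
        # accepting such moves is what lets the search leave a local optimum.
        current = max(allowed, key=sp_score)
        tabu.append(current)
        score = sp_score(current)
        if score > best_score:
            best, best_score = current, score
    return best, best_score


if __name__ == "__main__":
    alignment, score = tabu_search(["GATTACA", "GATACA", "GTTACA"])
    print("\n".join(alignment))
    print("score:", score)
```

The tabu list here stores whole alignments for a fixed tenure, which is the simplest form of short-term memory. An implementation along the lines of the abstract would additionally restrict moves to poorly aligned regions of an elite solution and shift between such regions over time.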

Yi Wang | Kuo-Bin Li | Tariq Riaz
