Multiple Sequence Alignment Using Tabu Search

Tabu search is a meta-heuristic approach that has proved useful for solving combinatorial optimization problems. We use the adaptive memory features of tabu search to align multiple sequences. Adaptive memory helps the search process avoid local optima and explore the solution space economically and effectively without getting trapped in cycles. The algorithm is further enhanced with extended tabu search features, namely intensification and diversification. It intensifies by directing the search to poorly aligned regions of an elite solution, and softly diversifies by moving from one poorly aligned region to another. The neighbors of a solution are generated stochastically, and a consistency-based objective function is used to measure the quality of each candidate. The algorithm is tested on datasets from the BAliBASE benchmark database. Our experiments show that for datasets comprising orphan sequences, divergent families, and long internal insertions, tabu search generates better alignments than the other methods studied in this paper. The source code of our tabu search algorithm is available at http://www.bii.a-star.edu.sg/~tariq/tabu/ .
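
The core loop described above (stochastic neighbors, a short-term memory that blocks recently visited solutions, and an objective function that ranks candidates) can be illustrated with a minimal Python sketch. This is not the authors' implementation. All names (`sp_score`, `pad`, `random_neighbor`, `tabu_search`) and parameter values are hypothetical, a simple sum-of-pairs score stands in for the consistency-based objective, and intensification and diversification are omitted.

```python
import random
from collections import deque
from itertools import combinations


def sp_score(aln):
    """Toy sum-of-pairs score (an assumption, not the paper's objective):
    +1 for each identical residue pair in a column, -1 for each
    residue/gap pair, 0 for mismatches and gap/gap pairs."""
    score = 0
    for col in zip(*aln):
        for a, b in combinations(col, 2):
            if a == "-" and b == "-":
                continue
            if a == "-" or b == "-":
                score -= 1
            elif a == b:
                score += 1
    return score


def pad(seqs, extra=1):
    """Build a starting alignment by right-padding every sequence with gaps."""
    width = max(len(s) for s in seqs) + extra
    return tuple(s + "-" * (width - len(s)) for s in seqs)


def random_neighbor(aln, rng):
    """Move one randomly chosen gap within one randomly chosen row.
    Row length and residue order are preserved."""
    rows = list(aln)
    with_gaps = [i for i, r in enumerate(rows) if "-" in r]
    if not with_gaps:
        return aln
    i = rng.choice(with_gaps)
    row = list(rows[i])
    gap_positions = [k for k, c in enumerate(row) if c == "-"]
    row.pop(rng.choice(gap_positions))
    row.insert(rng.randrange(len(row) + 1), "-")
    rows[i] = "".join(row)
    return tuple(rows)


def tabu_search(seqs, iterations=200, n_neighbors=20, tenure=10, extra=1, seed=0):
    """Basic tabu search: always move to the best non-tabu neighbor,
    even if it is worse than the current alignment."""
    rng = random.Random(seed)
    current = pad(seqs, extra)
    best, best_score = current, sp_score(current)
    # Short-term memory: the last `tenure` visited alignments are forbidden.
    tabu = deque([current], maxlen=tenure)
    for _ in range(iterations):
        neighbors = [random_neighbor(current, rng) for _ in range(n_neighbors)]
        admissible = [(sp_score(n), n) for n in neighbors if n not in tabu]
        if not admissible:
            continue
        score, current = max(admissible, key=lambda t: t[0])
        tabu.append(current)
        if score > best_score:
            best, best_score = current, score
    return best, best_score


if __name__ == "__main__":
    alignment, score = tabu_search(["ACGT", "AGT", "ACT"])
    print("\n".join(alignment))
    print("score:", score)
```

Because the search accepts the best admissible neighbor even when it scores lower than the current alignment, it can leave a local optimum, while the tabu list keeps it from stepping straight back into recently visited alignments.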
