An evolutionary approach for performing multiple sequence alignment

Despite being a very common task in bioinformatics, multiple sequence alignment is not a trivial matter. Arranging a set of molecular sequences to reveal their similarities and differences is often made harder by the size and complexity of the search space involved, which undermine approaches that try to explore that space exhaustively. Genetic Algorithms, which are well suited to combinatorial optimization problems in large and complex search spaces, therefore emerge as serious candidates for tackling the multiple sequence alignment problem. We have developed an evolutionary approach, AlineaGA, which uses a Genetic Algorithm with local search optimization embedded in its mutation operators to perform multiple sequence alignment. We have now enhanced its selection method by employing an elitist strategy, and we have also developed a new crossover operator. These changes allow AlineaGA to improve its robustness and to obtain fitter solutions. We have also studied the effect of the mutation probability on the evolution of solutions by analyzing the performance of the whole population across generations. We conclude that increasing the mutation probability leads to better solutions in fewer generations and that the mutation operators have a dramatic effect in this particular domain.
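As a rough illustration of the ingredients named above (a fitness function over alignments, elitist selection, and a mutation step applied with a given probability), the following Python sketch may help. It is not AlineaGA's implementation: the match/mismatch/gap scores, the `shift_gap` mutation, the binary tournament, and all function and parameter names are assumptions made for this example, and the crossover operator and local search are omitted because the abstract does not describe them.

```python
import random

GAP = "-"

def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Score an alignment (list of equal-length rows) column by column.

    The scoring values are illustrative assumptions, not the paper's.
    """
    score = 0
    for column in zip(*alignment):
        for i in range(len(column)):
            for j in range(i + 1, len(column)):
                a, b = column[i], column[j]
                if a == GAP and b == GAP:
                    continue  # gap-gap pairs contribute nothing
                if a == GAP or b == GAP:
                    score += gap
                elif a == b:
                    score += match
                else:
                    score += mismatch
    return score

def shift_gap(alignment, rng):
    """Hypothetical mutation: move one gap of one row to a random position."""
    rows = list(alignment)
    r = rng.randrange(len(rows))
    row = list(rows[r])
    gap_positions = [i for i, c in enumerate(row) if c == GAP]
    if not gap_positions:
        return rows  # nothing to move in this row
    row.pop(rng.choice(gap_positions))
    row.insert(rng.randrange(len(row) + 1), GAP)  # row length is restored
    rows[r] = "".join(row)
    return rows

def next_generation(population, rng, elite_count=1, mutation_prob=0.5):
    """One generation: copy the elites unchanged, fill the rest by
    binary tournament followed by mutation with probability mutation_prob."""
    ranked = sorted(population, key=sum_of_pairs, reverse=True)
    new_population = [list(ind) for ind in ranked[:elite_count]]
    while len(new_population) < len(population):
        a, b = rng.sample(ranked, 2)
        child = list(max(a, b, key=sum_of_pairs))
        if rng.random() < mutation_prob:
            child = shift_gap(child, rng)
        new_population.append(child)
    return new_population
```

Because the elites are copied unchanged, the best score in the population cannot decrease from one generation to the next under this scheme, which is one way an elitist strategy can contribute to robustness. Varying `mutation_prob` in such a loop is a simple way to observe how mutation pressure affects the whole population.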
