Multiple sequence alignment using evolutionary programming

Multiple sequence alignment can be used to identify common structure in ordered strings of nucleotides (in DNA or RNA) or amino acids (in proteins). Current multiple sequence alignment algorithms work well for sequences with high similarity but do not scale well when the sequences are long or numerous, or when their similarity is low. This paper develops an evolutionary programming (EP) algorithm for multiple sequence alignment. An EP method with representation-specific variation operators is proposed and tested on several data sets. Comparisons with other algorithms suggest that this algorithm is well suited to the multiple sequence alignment problem.
