Sequence alignment from the perspective of stochastic optimization: A survey

DNA and proteins are the fundamental biological sequences. DNA plays a vital role in the processes of life, and the proteins that cells synthesize from the instructions encoded in DNA are the building blocks of every living organism. Biological sequences are aligned for a variety of reasons: alignment helps to reveal functional and structural similarity between sequences, and biologists use aligned sequences to construct phylogenetic trees, characterize protein families, and predict protein structure. Sequence alignment is an active field of research characterized by very high computational complexity. Stochastic optimization is well suited to the problem because it can produce good alignments at a manageable computational cost. The objective of this study is to survey recent trends in stochastic optimization for sequence alignment, as a guide for researchers interested in the sequence alignment problem.
