A Genetic Approach with Controlled Crossover and Guided Mutation for Biological Sequence Alignment

Sequence alignment is one of the most useful techniques in bioinformatics. Biological sequences accumulate mutations over the course of evolution, which gradually change their residues. Sequence alignment is primarily performed to measure how similar an unknown sequence is to a known one by identifying common patterns of residues. The two sequences may be DNA or protein sequences of equal or unequal length. In this article, we propose a novel Genetic Algorithm (GA) based alignment technique with modified crossover and mutation operations for finding the best alignment of a sequence pair. We compare the performance of the proposed method, analytically and statistically, with other well-known and relevant sequence alignment techniques. The results show that the proposed genetic method with modified operators outperforms the other sequence alignment approaches considered.
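To make the general idea concrete, the following is a minimal sketch of a plain genetic algorithm for pairwise alignment. It is not the controlled crossover or guided mutation proposed in the article, whose details the abstract does not give. Everything here is an assumption chosen for illustration: the scoring scheme (match +1, mismatch -1, gap -2), the encoding of an individual as a pair of gap-position tuples, the row-swapping crossover, the gap-moving mutation, and all function names and parameter values.

```python
import random

# Assumed scoring scheme, not taken from the article.
MATCH, MISMATCH, GAP = 1, -1, -2


def render(seq, gaps, length):
    """Build a gapped row: '-' at the given columns, residues elsewhere in order."""
    residues = iter(seq)
    gap_set = set(gaps)
    return "".join("-" if i in gap_set else next(residues) for i in range(length))


def fitness(row_a, row_b):
    """Column-wise score of two equal-length gapped rows."""
    total = 0
    for x, y in zip(row_a, row_b):
        if "-" in (x, y):
            total += GAP
        else:
            total += MATCH if x == y else MISMATCH
    return total


def random_gaps(rng, n_gaps, length):
    """Pick n_gaps distinct gap columns at random."""
    return tuple(sorted(rng.sample(range(length), n_gaps)))


def mutate(rng, gaps, length):
    """Move one randomly chosen gap to a currently ungapped column."""
    if not gaps or len(gaps) == length:
        return gaps
    free = [i for i in range(length) if i not in gaps]
    moved = list(gaps)
    moved[rng.randrange(len(moved))] = rng.choice(free)
    return tuple(sorted(moved))


def align(a, b, extra=2, pop_size=40, generations=60, seed=0):
    """Evolve gap placements for sequences a and b; return (row_a, row_b, score)."""
    rng = random.Random(seed)
    length = max(len(a), len(b)) + extra  # fixed alignment width

    def score(ind):
        return fitness(render(a, ind[0], length), render(b, ind[1], length))

    pop = [
        (random_gaps(rng, length - len(a), length),
         random_gaps(rng, length - len(b), length))
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        nxt = pop[:2]  # elitism: carry the two best individuals over unchanged
        while len(nxt) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            p1 = max(rng.sample(pop, 3), key=score)
            p2 = max(rng.sample(pop, 3), key=score)
            # Crossover: first row's gaps from p1, second row's gaps from p2.
            child = (p1[0], p2[1])
            if rng.random() < 0.3:
                child = (mutate(rng, child[0], length), child[1])
            if rng.random() < 0.3:
                child = (child[0], mutate(rng, child[1], length))
            nxt.append(child)
        pop = nxt
    best = max(pop, key=score)
    return render(a, best[0], length), render(b, best[1], length), score(best)


if __name__ == "__main__":
    row_a, row_b, s = align("GATTACA", "GCATGCU")
    print(row_a)
    print(row_b)
    print("score:", s)
```

Because each row keeps a fixed number of gaps, both the crossover and the mutation always yield a valid alignment of the original sequences, so no repair step is needed. A GA of this kind is a heuristic: unlike dynamic programming, it does not guarantee an optimal alignment.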
