A Memetic Algorithm for closest string problem and farthest string problem

Sequence consensus problems are especially important in studying molecular evolution, protein structures, and drug target design. In this paper, we work on two of these problems: the closest string problem and the farthest string problem. Both are NP-hard, and none of the exact algorithms proposed so far runs in polynomial time. Many non-exact algorithms have therefore been proposed that try to obtain ‘good’ solutions in acceptable time. We propose a Memetic Algorithm (MA) for the closest string problem that outperforms the existing algorithms, and then extend it to address the farthest string problem.
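
To make the problem statement concrete, the following is a minimal sketch of the closest string objective together with a generic memetic loop (an evolutionary search in which every offspring is refined by local search). It is not the algorithm proposed in the paper, whose operators and parameters are not given in this abstract. The function names, the default DNA alphabet, the uniform crossover, the steady-state replacement rule, and all parameter values are assumptions chosen for illustration.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def csp_cost(candidate, strings):
    """Closest string objective: largest Hamming distance to any input string."""
    return max(hamming(candidate, s) for s in strings)


def local_search(candidate, strings, alphabet):
    """Hill climbing over single-symbol substitutions (assumed local search)."""
    best = list(candidate)
    best_cost = csp_cost(best, strings)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            kept = best[i]
            for symbol in alphabet:
                if symbol == kept:
                    continue
                best[i] = symbol
                cost = csp_cost(best, strings)
                if cost < best_cost:
                    # Strict improvement: keep the new symbol.
                    best_cost, kept, improved = cost, symbol, True
                else:
                    # No improvement: restore the symbol kept so far.
                    best[i] = kept
    return "".join(best), best_cost


def memetic_closest_string(strings, alphabet="ACGT", pop_size=20,
                           generations=30, mutation_rate=0.05, seed=0):
    """Illustrative memetic loop; all defaults are assumptions, not the paper's."""
    rng = random.Random(seed)
    length = len(strings[0])
    # Each individual is a (string, cost) pair already refined by local search.
    population = [
        local_search([rng.choice(alphabet) for _ in range(length)],
                     strings, alphabet)
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Binary tournament selection, run twice to pick two parents.
        p1, p2 = (min(rng.sample(population, 2), key=lambda ind: ind[1])
                  for _ in range(2))
        # Uniform crossover followed by per-position random mutation.
        child = [rng.choice(pair) for pair in zip(p1[0], p2[0])]
        child = [rng.choice(alphabet) if rng.random() < mutation_rate else c
                 for c in child]
        # The memetic step: refine the offspring before it competes.
        child = local_search(child, strings, alphabet)
        worst = max(range(pop_size), key=lambda i: population[i][1])
        if child[1] <= population[worst][1]:
            population[worst] = child
    return min(population, key=lambda ind: ind[1])


if __name__ == "__main__":
    inputs = ["ACGT", "ACGA", "ACCT"]
    print(memetic_closest_string(inputs))
```

For the farthest string problem the objective is instead the smallest Hamming distance to any input string, and it is maximized rather than minimized. In a sketch like this one, that would mean swapping `csp_cost` for a `min`-based function and reversing the comparisons.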
