An improved heuristic for the far from most strings problem

The Far From Most Strings Problem (FFMSP) asks for a string that is far from as many strings as possible in a given set. All input strings and the output string have the same length, and two strings are far from each other if their Hamming distance is greater than or equal to a given threshold. FFMSP belongs to the class of sequence consensus problems, which have applications in molecular biology, among other fields. FFMSP is NP-hard, and it does not admit a constant-ratio approximation unless P = NP. In the last few years, heuristic and metaheuristic algorithms have been proposed for the problem. These algorithms use local search and therefore require a heuristic function, also called an evaluation function, to evaluate candidate solutions during the search. The heuristic function they use for this purpose is the problem's objective function. However, since many candidate solutions can have the same objective value, the resulting search landscape includes many points that correspond to local maxima. In this paper, we devise a new heuristic function to evaluate candidate solutions. We then incorporate the proposed heuristic function within a Greedy Randomized Adaptive Search Procedure (GRASP), a metaheuristic originally proposed for the problem by Festa. In our experiments, the resulting algorithm outperforms the state of the art with respect to solution quality, in some cases by orders of magnitude, on both random and real data. The results indicate that the proposed heuristic considerably reduces the number of local optima.
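As a minimal sketch of the objective function described above (not the new heuristic function, which the abstract does not specify), the following Python code counts how many input strings lie at Hamming distance at least the threshold from a candidate. The names `hamming`, `ffmsp_objective`, and the sample DNA strings are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def ffmsp_objective(candidate: str, strings: Sequence[str], threshold: int) -> int:
    """Count the input strings that are 'far' from the candidate,
    i.e. whose Hamming distance to it is >= threshold."""
    return sum(hamming(candidate, s) >= threshold for s in strings)


# Hypothetical toy instance: three strings of length 4, threshold 3.
strings = ["ACGT", "ACGA", "TCGT"]
threshold = 3

for candidate in ["TGCA", "TGCT", "ACGT", "ACGA"]:
    print(candidate, ffmsp_objective(candidate, strings, threshold))
```

On this toy instance, the candidates `ACGT` and `ACGA` receive the same objective value even though they are different strings, which illustrates the kind of tie the abstract refers to: the objective alone cannot rank such candidates during local search.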
