A parallel multistart algorithm for the closest string problem

In this paper we describe and implement a parallel algorithm to find approximate solutions for the Closest String Problem (CSP). The CSP, also known as Motif Finding problem, has applications in Coding Theory and Computational Biology. The CSP is NP-hard which motivates us to think about heuristics to solve large instances. Several approximation algorithms have been designed for the CSP, but all of them have a poor performance guarantee. Recently some researchers have shown empirically that integer programming techniques can be successfully used to solve moderate-size instances (10-30 strings each of which is 300-800 characters long) of the CSP. However, real-world instances are larger than those tested. In this paper we show how a simple heuristic can be used to find near-optimal solutions to that problem. We implemented a parallel version of this heuristic and report computational experiments on large-scale instances. These results show the effectiveness of our approach.