On the Parameterized Intractability of Motif Search Problems

We show that Closest Substring, one of the most important problems in the field of consensus string analysis, is W[1]-hard when parameterized by the number k of input strings, and that it remains so even over a binary alphabet. This is done by giving a “strongly structure-preserving” reduction from the graph problem Clique to Closest Substring. The problem is therefore unlikely to be solvable in time O(f(k) · n^c) for any function f of k and any constant c independent of k; that is, the combinatorial explosion seemingly inherent to this NP-hard problem cannot be confined to the parameter k. The problem can therefore be expected to be intractable, in any practical sense, for k ≥ 3. Our result supports the intuition that Closest Substring is computationally much harder than its special case Closest String, although both problems are NP-complete. We also prove W[1]-hardness for other parameterizations in the case of unbounded alphabet size. Our W[1]-hardness result for Closest Substring generalizes to Consensus Patterns, a problem arising in computational biology.
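
To make the problem statement concrete, the following is a minimal brute-force sketch of Closest Substring under its standard formulation: given k strings, a length L, and a distance bound d, find a string s of length L such that every input string contains a length-L substring within Hamming distance d of s. The function names, the parameter names, and the exhaustive enumeration strategy are illustrative assumptions and are not taken from the paper; the search takes time exponential in L and is meant only to show what is being asked, not how to solve it efficiently.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def closest_substring(strings, length, d, alphabet="01"):
    """Exhaustive search (hypothetical helper, exponential in `length`).

    Returns the first candidate s of the given length, in lexicographic
    order over `alphabet`, such that every input string has a substring
    of that length within Hamming distance d of s, or None if none exists.
    """
    for cand in product(alphabet, repeat=length):
        s = "".join(cand)
        if all(
            any(
                hamming(s, t[i:i + length]) <= d
                for i in range(len(t) - length + 1)
            )
            for t in strings
        ):
            return s
    return None


if __name__ == "__main__":
    # k = 3 binary input strings, substring length 3.
    inputs = ["0110", "1011", "0011"]
    print(closest_substring(inputs, 3, 1))
    print(closest_substring(inputs, 3, 0))
```

Closest String is the special case in which every input string already has length L, so each string contributes exactly one candidate window. The freedom to choose a window in each input string is what the abstract identifies as the source of the additional hardness.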
