An integrated complexity analysis of problems from computational biology
We perform an integrated complexity analysis on a number of combinatorial problems arising from the field of computational biology. The classic framework of ${\cal N}{\cal P}$-completeness, algorithmic design techniques for bounded width graphs, and parameterized computational complexity together provide a clear and detailed map of the intrinsic hardness of the following problems: INTERVALIZING COLORED GRAPHS and SHORTEST COMMON SUPERSEQUENCE.
The fundamental concern of parameterized complexity is the apparent qualitative difference in algorithmic behaviour displayed by many problems when one or more input parameters are bounded. For many problems, only a small range of values for these parameters captures most instances arising in practice. This is certainly the case in several specific arenas of computational biology, such as DNA physical mapping and multiple sequence alignment. At its most general level, parameterized complexity partitions problems into two classes: fixed-parameter tractable (FPT) and fixed-parameter intractable (hard for classes of the W-hierarchy). The former indicates that the particular parameterization may allow for efficient practical algorithms, whilst the latter indicates that the parameterization is not effective (asymptotically) in alleviating the intractability.
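For reference, the dichotomy can be stated in the standard textbook form. The symbols here are notation introduced for illustration and do not come from the text above: $x$ is an instance, $k$ its parameter, $f$ and $g$ arbitrary computable functions, and $c$ a constant independent of $k$. A parameterized problem is fixed-parameter tractable if it can be solved within the first bound below; the second bound, in which the parameter appears in the exponent, is the kind of behaviour that hardness for the W-hierarchy suggests cannot be improved to the first.

$$
T(x,k) \;\le\; f(k)\cdot |x|^{c}
\qquad\text{versus}\qquad
T(x,k) \;\le\; |x|^{g(k)} .
$$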
The problem INTERVALIZING COLORED GRAPHS (ICG) models, in a straightforward albeit limited way, the determination of contig assemblies in the mapping of DNA. We show ICG to be ${\cal N}{\cal P}$-complete (no polynomial-time algorithm unless ${\cal P}={\cal N}{\cal P}$), not finite-state (a very general algorithmic design technique for bounded width graphs fails), and hard for the parameterized complexity class $W[1]$ (a specific parameterized version of ICG does not admit an efficient algorithm unless many other well-known, and apparently hard, problems admit efficient algorithms).
Both SHORTEST COMMON SUPERSEQUENCE and its sister problem LONGEST COMMON SUBSEQUENCE have applications in multiple sequence alignment. We show that SHORTEST COMMON SUPERSEQUENCE PARAMETERIZED BY THE NUMBER OF INPUT STRINGS AND THE SIZE OF THE ALPHABET is hard for the complexity class $W[1]$. As is the case with ICG, this implies that it does not admit efficient algorithms unless some unlikely computational complexity collapses occur.
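To make the role of the two parameters concrete, the following is a minimal sketch of the textbook dynamic program for the length of a shortest common supersequence of several strings. It is not an algorithm taken from this work; the function name `scs_length` and its structure are assumptions made for illustration. A state records how many characters of each input string have been consumed, so for $k$ strings of length at most $n$ there can be up to $(n+1)^k$ states, each with at most one transition per alphabet symbol.

```python
from functools import lru_cache


def scs_length(strings):
    """Length of a shortest common supersequence of the given strings.

    Illustrative sketch only (hypothetical helper, not from the source).
    The recursion depth is bounded by the total input length, so this is
    meant for small inputs.
    """
    strings = tuple(strings)
    lengths = tuple(len(s) for s in strings)

    @lru_cache(maxsize=None)
    def solve(pos):
        # pos[i] = number of characters of strings[i] already covered.
        if pos == lengths:
            return 0
        # Only characters that some unfinished string needs next are useful.
        next_chars = {s[i] for s, i in zip(strings, pos) if i < len(s)}
        best = None
        for c in next_chars:
            # Emitting c advances every string whose next character is c.
            nxt = tuple(
                i + 1 if i < len(s) and s[i] == c else i
                for s, i in zip(strings, pos)
            )
            cand = 1 + solve(nxt)
            if best is None or cand < best:
                best = cand
        return best

    return solve((0,) * len(strings))


if __name__ == "__main__":
    print(scs_length(["abac", "cab"]))
```

The number of input strings appears in the exponent of the state count, so this sketch runs in polynomial time for each fixed number of strings but is not of the fixed-parameter tractable form. The $W[1]$-hardness result stated above indicates that this kind of dependence is unlikely to be removable, even when the alphabet size is also treated as a parameter.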