Finding Common Subsequences with Arcs and Pseudoknots

This paper examines the complexity of comparing sequences that have arcs linking symbol pairs. Such arc-annotated sequences can represent molecular sequences with bonds between bases, such as RNA sequences. Crossing arcs that can represent sequence pseudoknots are included. The problem of finding the longest common subsequence, on which pairwise sequence comparison algorithms are frequently based, is modified to require common subsequences to preserve the arcs induced by the selected symbol positions. This problem is then analyzed using classical and parameterized complexity. It is shown to be NP-complete, and alsoW[1]-complete when parameterized by desired length of common subsequence. If it is parameterized instead by arc cutwidth k, however, it becomes fixed-parameter tractable, and usable for sequences with arc structures of limited cutwidth. An algorithm is given that runs in time ∈ O(9knm).