Extracting constrained 2-interval subsets in 2-interval sets

2-interval sets were used in [S. Vialette, Pattern matching over 2-intervals sets, in: Proc. 13th Annual Symposium Combinatorial Pattern Matching, CPM 2002, in: Lecture Notes in Computer Science, vol. 2373, Springer-Verlag, 2002, pp. 53-63; S. Vialette, On the computational complexity of 2-interval pattern matching, Theoret. Comput. Sci. 312 (2-3) (2004) 223-249] for establishing a general representation for macroscopic describers of RNA secondary structures. In this context, we have a 2-interval for each legal local fold in a given RNA sequence, and a constrained pattern made of disjoint 2-intervals represents a putative RNA secondary structure. We focus here on the problem of extracting a constrained pattern from a set of 2-intervals. More precisely, given a set D of 2-intervals and a model R describing whether two disjoint 2-intervals in a solution can be in precedence order (<), be allowed to nest (⊏) and/or be allowed to cross (≬), we consider the problem of finding a maximum cardinality subset D′ ⊆ D of disjoint 2-intervals such that any two 2-intervals in D′ agree with R. The different combinations of restrictions on model R alter the computational complexity of the problem, and need to be examined separately. In this paper, we improve the time complexity of [S. Vialette, On the computational complexity of 2-interval pattern matching, Theoret. Comput. Sci. 312 (2-3) (2004) 223-249] for model R = {≬} by giving an optimal O(n log n) time algorithm, where n is the cardinality of the 2-interval set D. We also give a graph-like relaxation for model R = {⊏, ≬} that is solvable in O(n²√n) time. Finally, we prove that the considered problem is NP-complete for model R = {<, ≬} even for same-length intervals, and give a fixed-parameter tractability result based on the crossing structure of D.
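To make the three relations concrete, here is a minimal Python sketch. It is an illustration only, not one of the algorithms from the paper. It assumes a 2-interval is encoded as a pair of closed integer intervals `((a, b), (c, d))` with `b < c`; the function names `relation`, `agrees` and `best_subset` are hypothetical, and `best_subset` is an exponential brute force meant only for tiny inputs.

```python
from itertools import combinations

def before(p, q):
    """True if interval p lies entirely to the left of interval q."""
    return p[1] < q[0]

def relation(d1, d2):
    """Classify two 2-intervals as '<', 'nest' or 'cross'.

    Returns None when the two 2-intervals are not disjoint.
    Assumed encoding: ((a, b), (c, d)) with b < c.
    """
    # Order the pair so that the first 2-interval starts first.
    (i1, j1), (i2, j2) = sorted((d1, d2))
    if before(j1, i2):
        return "<"        # I1 < J1 < I2 < J2: precedence
    if before(i1, i2) and before(i2, j1) and before(j1, j2):
        return "cross"    # I1 < I2 < J1 < J2: crossing
    if before(i1, i2) and before(j2, j1):
        return "nest"     # I1 < I2 < J2 < J1: second nested in first
    return None           # some intervals overlap

def agrees(subset, model):
    """True if every pair in subset is disjoint and related by a relation in model."""
    return all(relation(x, y) in model for x, y in combinations(subset, 2))

def best_subset(two_intervals, model):
    """Brute-force maximum subset agreeing with model (exponential time)."""
    for k in range(len(two_intervals), 0, -1):
        for cand in combinations(two_intervals, k):
            if agrees(cand, model):
                return list(cand)
    return []

a = ((1, 2), (5, 6))
b = ((3, 4), (7, 8))
c = ((9, 10), (11, 12))
d = ((3, 3), (4, 4))
print(relation(a, b), relation(a, c), relation(a, d))
print(best_subset([a, b, c, d], {"cross"}))
```

In this toy instance `a` and `b` cross, `a` precedes `c`, and `d` is nested in `a`, while `b` and `d` overlap and so can never appear together in a solution, whatever the model.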
