Maximum Motif Problem in Vertex-Colored Graphs

Searching for motifs in graphs has become a crucial problem in the analysis of biological networks. In this context, different graph motif problems have been considered [13,7,5]. Pursuing a line of research pioneered by Lacroix et al. [13], we introduce in this paper a new graph motif problem: given a vertex-colored graph $G$ and a motif $\mathcal{M}$, where a motif is a multiset of colors, find a maximum cardinality submotif $\mathcal{M}' \subseteq \mathcal{M}$ that occurs as a connected motif in $G$. We prove that the problem is APX-hard even when the target graph is a tree of maximum degree 3, the motif is actually a set, and each color occurs at most twice in the tree. Next, we strengthen this result by proving that the problem is not approximable within factor $2^{\log^{\delta} n}$, for any constant $\delta < 1$, unless $\mathrm{NP} \subseteq \mathrm{DTIME}(2^{\mathrm{poly} \log n})$. We complement these results by presenting two fixed-parameter algorithms for the problem, where the parameter is the size of the solution. Finally, we give fast exact exponential-time algorithms for the problem.
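To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not one of the algorithms of the paper: it simply enumerates vertex subsets from largest to smallest and returns the first connected one whose multiset of colors is contained in the motif. The function names (`is_connected`, `max_motif_bruteforce`), the adjacency-dictionary representation, and the example graph are assumptions made for illustration only; the running time is exponential in the number of vertices.

```python
from collections import Counter
from itertools import combinations


def is_connected(vertices, adj):
    """Return True if `vertices` induces a connected subgraph of `adj`."""
    vertices = set(vertices)
    if not vertices:
        return True
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in vertices and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == vertices


def max_motif_bruteforce(adj, color, motif):
    """Return a largest vertex set that is connected and whose colors
    form a sub-multiset of `motif` (illustrative exhaustive search)."""
    motif = Counter(motif)
    nodes = list(adj)
    max_size = min(len(nodes), sum(motif.values()))
    # Try candidate sizes from largest to smallest; the first hit is optimal.
    for k in range(max_size, 0, -1):
        for subset in combinations(nodes, k):
            used = Counter(color[v] for v in subset)
            within_motif = all(used[c] <= motif[c] for c in used)
            if within_motif and is_connected(subset, adj):
                return set(subset)
    return set()


# Hypothetical example: the path a - b - c - d.
example_adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
example_color = {"a": "red", "b": "blue", "c": "red", "d": "green"}
example_motif = ["red", "blue", "green"]

best = max_motif_bruteforce(example_adj, example_color, example_motif)
print(sorted(best))
```

In this example the whole motif occurs: the vertices b, c, d are connected and carry the colors blue, red, green. The set a, b, c is rejected because it uses red twice while the motif contains red only once.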

[1]  Tao Jiang, et al. On the Approximation of Shortest Common Supersequences and Longest Common Subsequences, 1994, SIAM J. Comput.

[2]  Mihalis Yannakakis, et al. Optimization, approximation, and complexity classes, 1991, STOC '88.

[3]  Mam Riess Jones. Color Coding, 1962, Human factors.

[4]  Rolf Niedermeier, et al. Invitation to Fixed-Parameter Algorithms, 2006.

[5]  Roded Sharan, et al. Identification of protein complexes by comparative analysis of yeast and bacterial protein interaction data, 2004, J. Comput. Biol.

[6]  Reinhard Diestel, et al. Graph Theory, 1997.

[7]  K. Brown, et al. Graduate Texts in Mathematics, 1982.

[8]  R. Karp, et al. Conserved patterns of protein interaction in multiple species, 2005.

[9]  Roded Sharan, et al. Topology-Free Querying of Protein Interaction Networks, 2009, RECOMB.

[10]  Cristina G. Fernandes, et al. Motif Search in Graphs: Application to Metabolic Networks, 2006, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[11]  David R. Karger, et al. On approximating the longest path in a graph, 1997, Algorithmica.

[12]  Wojciech Szpankowski, et al. Pairwise Local Alignment of Protein Interaction Networks Guided by Models of Evolution, 2005, RECOMB.

[13]  Roded Sharan, et al. Identification of Protein Complexes by Comparative Analysis of Yeast and Bacterial Protein Interaction Data, 2005, J. Comput. Biol.

[14]  Tao Jiang, et al. On the Complexity of Comparing Evolutionary Trees, 1996, Discret. Appl. Math.

[15]  Riccardo Dondi, et al. Weak pattern matching in colored graphs: Minimizing the number of connected components, 2007, ICTCS.

[16]  Tao Jiang, et al. On the Approximation of Shortest Common Supersequences and Longest Common Subsequences, 1995, SIAM J. Comput.

[17]  Michael R. Fellows, et al. Parameterized Complexity, 1998.

[18]  R. Karp, et al. Conserved pathways within bacteria and yeast as revealed by global protein network alignment, 2003, Proceedings of the National Academy of Sciences of the United States of America.

[19]  Michael R. Fellows, et al. Sharp Tractability Borderlines for Finding Connected Motifs in Vertex-Colored Graphs, 2007, ICALP.

[20]  Roded Sharan, et al. Efficient Algorithms for Detecting Signaling Pathways in Protein Interaction Networks, 2006, J. Comput. Biol.

[21]  Christian Komusiewicz, et al. Parameterized Algorithms and Hardness Results for Some Graph Motif Problems, 2008, CPM.

[22]  Ming-Deh A. Huang, et al. Proof of proposition 2, 1992.

[23]  Robin Milner, et al. On Observing Nondeterminism and Concurrency, 1980, ICALP.