Approximate labelled subtree homeomorphism

Given two undirected trees T and P, the Subtree Homeomorphism Problem asks whether T has a subtree t that can be transformed into P by removing entire subtrees and by repeatedly removing a degree-2 node and adding an edge that joins its two neighbors. In this paper we extend the Subtree Homeomorphism Problem to a new optimization problem by enriching the subtree comparison with node-to-node similarity scores. The new problem, called Approximate Labelled Subtree Homeomorphism (ALSH), is to compute the subtree of T that is homeomorphic to P and maximizes the overall node-to-node resemblance. We describe an O(m^2 n / log m + mn log n) algorithm for solving ALSH on unordered, unrooted trees, where m and n are the number of vertices in P and T, respectively. We also give an O(mn) algorithm for rooted ordered trees, an O(mn log m) algorithm for unrooted cyclically ordered trees, and an O(mn) algorithm for unrooted linearly ordered trees.
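
To make the rooted ordered case concrete, the following is a minimal dynamic-programming sketch, not the algorithm from the paper. It assumes that each tree is given as a dictionary mapping a node to its ordered list of children, that the score of an embedding is simply the sum of sim(u, v) over mapped node pairs, and that removed nodes carry no penalty. The names alsh_rooted_ordered, exact, below, and sim are hypothetical and chosen for illustration only.

```python
from functools import lru_cache

NEG_INF = float("-inf")


def alsh_rooted_ordered(p_children, t_children, p_root, t_root, sim):
    """Best total similarity of a homeomorphic embedding of rooted ordered
    P into T, or -inf if no embedding exists (illustrative sketch only)."""

    @lru_cache(maxsize=None)
    def exact(u, v):
        # Best score with pattern node u mapped exactly onto text node v.
        cu, cv = p_children[u], t_children[v]
        if len(cu) > len(cv):
            return NEG_INF  # each child of u needs its own child subtree of v
        # prev[j]: best score placing the children of u handled so far,
        # in order, into the first j child subtrees of v.
        prev = [0.0] * (len(cv) + 1)
        for i in range(1, len(cu) + 1):
            cur = [NEG_INF] * (len(cv) + 1)
            for j in range(i, len(cv) + 1):
                # Either child j of v is removed (cur[j - 1]) or it hosts
                # child i of u somewhere in its subtree.
                take = prev[j - 1] + below(cu[i - 1], cv[j - 1])
                cur[j] = max(cur[j - 1], take)
            prev = cur
        return sim(u, v) + prev[len(cv)]

    @lru_cache(maxsize=None)
    def below(u, w):
        # Best score with u mapped to w or to any descendant of w; the nodes
        # on the path from w down to the image of u are the removed
        # degree-2 nodes.
        best = exact(u, w)
        for x in t_children[w]:
            best = max(best, below(u, x))
        return best

    return below(p_root, t_root)


# Toy instance: similarity is 1.0 when labels agree and 0.0 otherwise.
p_children = {"a": ["b", "c"], "b": [], "c": []}
p_label = {"a": "A", "b": "B", "c": "C"}
t_children = {0: [1, 2], 1: [3], 2: [4, 5], 3: [], 4: [], 5: []}
t_label = {0: "A", 1: "X", 2: "C", 3: "B", 4: "Y", 5: "Z"}


def label_sim(u, v):
    return 1.0 if p_label[u] == t_label[v] else 0.0


best = alsh_rooted_ordered(p_children, t_children, "a", 0, label_sim)
print(best)
```

In the toy instance, pattern node b is best placed on text node 3, which requires removing the degree-2 node 1, and pattern leaf c is placed on node 2 after removing the subtrees rooted at 4 and 5. Each pair (u, v) costs time proportional to the product of the two nodes' child counts, so the sketch does O(mn) work overall. The recursion depth grows with the height of T, so an iterative bottom-up traversal would be preferable for deep trees.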
