Sparse Dynamic Programming for Evolutionary-Tree Comparison

Constructing evolutionary trees for species sets is a fundamental problem in biology. Unfortunately, there is no single agreed-upon method for this task, and many methods are in use. Current practice dictates that trees be constructed using different methods and that the resulting trees be compared for consensus. As the number of species under consideration has grown, it has become necessary to automate this process. We study one formalization of the problem: the maximum agreement-subtree ($\MAST$) problem. The $\MAST$ problem is as follows: given a set $A$ and two rooted trees $\cT_0$ and $\cT_1$ leaf-labeled by the elements of $A$, find a maximum-cardinality subset $B$ of $A$ such that the topological restrictions of $\cT_0$ and $\cT_1$ to $B$ are isomorphic. We show that this problem reduces to unary weighted bipartite matching ($\UWBM$) with an $O(n^{1+o(1)})$ additive overhead. We also show that $\UWBM$ reduces linearly to $\MAST$. Thus our algorithm is optimal unless $\UWBM$ can be solved in near-linear time. The overall running time of our algorithm is $O(n^{1.5} \log n)$, improving on the previous best algorithm, which runs in $O(n^2)$ time. We also derive an $O(n c^{\sqrt{\log n}})$-time algorithm for the case of bounded degrees, whereas the previous best algorithm for this case runs in $O(n^2)$ time, as in the unbounded case.
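
To make the $\MAST$ definition concrete, the following is a minimal Python sketch of the classical dynamic program over pairs of nodes, in which the children of two internal nodes are paired by a maximum-weight bipartite matching. It is not the sparse algorithm of this paper and does not attain its running time. The representation (nested tuples with string leaf labels) and the names `leaves`, `best_matching`, and `mast_size` are assumptions made for illustration, and the matching step is brute force, so it is only suitable for small node degrees.

```python
from functools import lru_cache
from itertools import permutations


def leaves(t):
    """Return the set of leaf labels of a nested-tuple tree (leaves are strings)."""
    if isinstance(t, str):
        return frozenset([t])
    return frozenset().union(*(leaves(c) for c in t))


def best_matching(weights):
    """Brute-force maximum-weight bipartite matching on a small weight matrix.

    Weights are non-negative, so saturating the smaller side is optimal.
    Exponential in the node degree: an illustrative stand-in for a real
    matching algorithm.
    """
    rows, cols = len(weights), len(weights[0])
    if rows > cols:  # transpose so that rows <= cols
        weights = [list(col) for col in zip(*weights)]
        rows, cols = cols, rows
    return max(
        sum(weights[i][j] for i, j in enumerate(perm))
        for perm in permutations(range(cols), rows)
    )


def mast_size(t0, t1):
    """Size of a maximum agreement subtree of two rooted leaf-labeled trees."""

    @lru_cache(maxsize=None)
    def go(u, v):
        # A leaf agrees with a subtree iff its label occurs in that subtree.
        if isinstance(u, str):
            return 1 if u in leaves(v) else 0
        if isinstance(v, str):
            return 1 if v in leaves(u) else 0
        # Case 1: the agreement subtree lies entirely below one child.
        best = max(go(u, c) for c in v)
        best = max(best, max(go(c, v) for c in u))
        # Case 2: u and v are matched; pair up their children optimally.
        weights = [[go(a, b) for b in v] for a in u]
        return max(best, best_matching(weights))

    return go(t0, t1)


if __name__ == "__main__":
    t0 = ((("a", "b"), "c"), "d")
    t1 = ((("a", "c"), "b"), "d")
    print(mast_size(t0, t1))  # the subset {a, b, d} agrees, so this prints 3
```

The matching in case 2 is where the connection to weighted bipartite matching appears: each pair of children contributes the size of the best agreement between their subtrees as an edge weight.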
