On the hardness of inferring phylogenies from triplet-dissimilarities

This work considers the problem of reconstructing a phylogenetic tree from triplet-dissimilarities, which are dissimilarities defined over taxon-triplets. Triplet-dissimilarities are possibly the simplest generalization of pairwise dissimilarities, and have been used for phylogenetic reconstruction in the past few years. We study the hardness of finding a tree that best fits a given triplet-dissimilarity table under the ℓ∞ norm. We show that the corresponding decision problem is NP-hard and that the corresponding optimization problem cannot be approximated in polynomial time within a constant multiplicative factor smaller than 1.4. On the positive side, we present a polynomial-time constant-rate approximation algorithm for this problem. We also address the issue of best fit under maximal distortion, which corresponds to the largest ratio between matching entries in two triplet-dissimilarity tables. We show that it is NP-hard to approximate the corresponding optimization problem within any constant multiplicative factor.
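The two fit criteria named above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: a table is represented as a hypothetical Python dict keyed by unordered taxon-triplets, the function names are made up for illustration, all entries are strictly positive, and the ratio is taken in whichever direction is larger. The numbers are arbitrary and are not derived from any tree.

```python
from itertools import combinations


def linf_distance(table_a, table_b):
    """Largest absolute difference between matching entries (l-infinity norm)."""
    assert table_a.keys() == table_b.keys(), "tables must cover the same triplets"
    return max(abs(table_a[t] - table_b[t]) for t in table_a)


def maximal_distortion(table_a, table_b):
    """Largest ratio between matching entries.

    Assumption: entries are strictly positive and the ratio is symmetric,
    i.e. max(a/b, b/a) for each triplet.
    """
    assert table_a.keys() == table_b.keys(), "tables must cover the same triplets"
    return max(
        max(table_a[t] / table_b[t], table_b[t] / table_a[t]) for t in table_a
    )


# Hypothetical example: four taxa give four unordered triplets.
taxa = ["a", "b", "c", "d"]
triplets = [frozenset(t) for t in combinations(taxa, 3)]

# Illustrative values only; 'fitted' stands in for a table induced by a candidate tree.
observed = dict(zip(triplets, [4.0, 6.0, 5.0, 8.0]))
fitted = dict(zip(triplets, [5.0, 6.0, 4.0, 6.0]))

print(linf_distance(observed, fitted))
print(maximal_distortion(observed, fitted))
```

The sketch only evaluates how well one given table fits another. The hard part studied in this work is the search over trees for the table that minimizes these quantities, which the code does not attempt.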
