Retractions of Finite Distance Functions Onto Tree Metrics

Trees with positively weighted edges induce a natural metric on any subset of vertices; however, not every metric is representable in this way. A problem arising in areas of classification, particularly in evolutionary biology, is how to approximate an arbitrary distance function by such a tree metric, and thereby estimate the underlying tree that generated the data. Such transformations, from distances to tree metrics (and thereby to edge-weighted trees), should have some basic properties such as continuity, but this is lacking in several popular methods, for example (as we show) in “neighbor joining.” However, a continuous transformation, due to Buneman, frequently leads to uninteresting trees. We show how Buneman's construction can be refined so as to lead to more informative trees without sacrificing continuity, and we provide two simple examples of its use. We also provide a sufficient condition for both the Buneman construction and its refinement to correctly recover the underlying tree.
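To make concrete the remark that not every metric arises from an edge-weighted tree, the following minimal sketch tests the classical four-point condition, a standard characterization of tree metrics that the abstract itself does not spell out. The function name `is_tree_metric`, the tolerance `tol`, and the two example matrices are illustrative assumptions, not taken from the paper; the input is assumed to be a symmetric matrix with zero diagonal, and strict positivity of distances between distinct points is not checked.

```python
from itertools import combinations_with_replacement

def is_tree_metric(d, tol=1e-9):
    """Check the four-point condition on a symmetric distance matrix d.

    For every four points (repeats allowed), the two largest of the three
    pairwise sums must be equal. Allowing repeated points also enforces
    non-negativity and the triangle inequality.
    """
    n = len(d)
    for i, j, k, l in combinations_with_replacement(range(n), 4):
        sums = sorted([d[i][j] + d[k][l],
                       d[i][k] + d[j][l],
                       d[i][l] + d[j][k]])
        if sums[2] - sums[1] > tol:
            return False
    return True

# Hypothetical example 1: leaves a, b, c, d of a tree with unit pendant
# edges and one unit internal edge separating {a, b} from {c, d}.
tree_like = [[0, 2, 3, 3],
             [2, 0, 3, 3],
             [3, 3, 0, 2],
             [3, 3, 2, 0]]

# Hypothetical example 2: shortest-path distances on a 4-cycle, which is
# a metric but cannot be represented by any edge-weighted tree.
four_cycle = [[0, 1, 2, 1],
              [1, 0, 1, 2],
              [2, 1, 0, 1],
              [1, 2, 1, 0]]

print(is_tree_metric(tree_like))
print(is_tree_metric(four_cycle))
```

The check is exhaustive over quadruples and so runs in time proportional to the fourth power of the number of points; it is meant only to illustrate the distinction between general metrics and tree metrics, not to reproduce the Buneman construction or its refinement.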
