Distributions of Tree Comparison Metrics—Some New Results

Measures of dissimilarity (metrics) for comparing trees are important tools in the quantitative analysis of evolutionary trees, but many of their properties are incompletely known. The present paper reports formulae for the distributions of three classes of tree comparison metrics: the partition (or symmetric difference) metric, the quartet metric (which compares subsets of four taxa), and a metric based on path-length differences between pairs of taxa. The properties studied include the mean and variance for several underlying distributions of trees, the range, the effect of the number of taxa, and methods of calculation. Three basic theorems and their proofs are reported, one for each class of tree comparison metric.
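
To make the first of these metrics concrete, the following is a minimal sketch of how the partition (symmetric difference) metric might be computed. It is an illustration only, not the paper's own method of calculation. It assumes each tree is written as nested Python tuples of taxon labels and is treated as unrooted; the function names `splits` and `partition_metric` are hypothetical.

```python
def splits(tree, taxa):
    """Collect the non-trivial bipartitions (splits) induced by the internal
    edges of a tree given as nested tuples of taxon labels (an assumed format)."""
    taxa = frozenset(taxa)
    ref = min(taxa)  # reference taxon used to pick one canonical side per split
    found = set()

    def leaves_below(node):
        if not isinstance(node, tuple):
            return frozenset([node])  # a leaf: pendant edges give trivial splits
        below = frozenset().union(*(leaves_below(child) for child in node))
        # Store the side of the split that does not contain the reference
        # taxon, so the same edge is always recorded as the same set.
        side = below if ref not in below else taxa - below
        if 1 < len(side) < len(taxa) - 1:
            found.add(side)
        return below

    leaves_below(tree)
    return found


def partition_metric(tree_a, tree_b, taxa):
    """Number of splits found in exactly one of the two trees."""
    return len(splits(tree_a, taxa) ^ splits(tree_b, taxa))


taxa = ["a", "b", "c", "d", "e"]
t1 = (("a", "b"), ("c", ("d", "e")))
t2 = (("a", "c"), ("b", ("d", "e")))
t3 = (("a", "d"), ("b", ("c", "e")))

d12 = partition_metric(t1, t2, taxa)  # the trees share the split {d, e}
d13 = partition_metric(t1, t3, taxa)  # the trees share no split
```

For binary trees on n taxa, each tree has n - 3 internal edges, so the value ranges from 0 to 2(n - 3); with five taxa the maximum is 4, which is the value reached by `t1` and `t3` in this example.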
