Nodal distances for rooted phylogenetic trees

Dissimilarity measures for (possibly weighted) phylogenetic trees based on comparing their vectors of path lengths between pairs of taxa have been present in the systematics literature since the early seventies. For rooted phylogenetic trees, however, these vectors can only separate non-weighted binary trees, and therefore these dissimilarity measures are metrics only on this class of rooted phylogenetic trees. In this paper we overcome this problem by splitting, in a suitable way, each path length between two taxa into two lengths. We prove that the resulting split path lengths matrices single out arbitrary rooted phylogenetic trees with nested taxa and arcs weighted in the set of positive real numbers. This allows the definition of metrics on this general class of rooted phylogenetic trees by comparing these matrices through metrics in spaces $\mathcal{M}_n(\mathbb{R})$ of real-valued $n \times n$ matrices. We conclude this paper by establishing some basic facts about the metrics for non-weighted phylogenetic trees defined in this way using $L^p$ metrics on $\mathcal{M}_n(\mathbb{R})$, with $p \in \mathbb{R}_{>0}$.
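As an illustration only, the construction can be sketched in Python. The abstract does not say how each path length is split; the sketch assumes the split is made at the lowest common ancestor of the two taxa, so that entry (i, j) holds the distance from that ancestor down to taxon i. The tree encoding (a child-to-parent dictionary) and the names `distances_to_ancestors`, `split_path_lengths` and `lp_distance` are hypothetical choices made for this example, not notation from the paper.

```python
def distances_to_ancestors(parent, weight, node):
    """Map each ancestor of `node` (including itself) to its distance down to `node`."""
    dist = {node: 0.0}
    total = 0.0
    while node in parent:
        total += weight.get(node, 1.0)  # weight of arc parent[node] -> node; 1 if unweighted
        node = parent[node]
        dist[node] = total
    return dist


def split_path_lengths(parent, taxa, weight=None):
    """Matrix whose (i, j) entry is the distance from LCA(i, j) down to taxon i.

    Assumption: the path between two taxa is split at their lowest common ancestor.
    """
    weight = weight or {}
    up = {t: distances_to_ancestors(parent, weight, t) for t in taxa}
    n = len(taxa)
    matrix = [[0.0] * n for _ in range(n)]
    for a, i in enumerate(taxa):
        for b, j in enumerate(taxa):
            if a == b:
                continue
            common = set(up[i]) & set(up[j])
            lca = min(common, key=lambda v: up[i][v])  # common ancestor nearest to i
            matrix[a][b] = up[i][lca]
    return matrix


def lp_distance(m1, m2, p):
    """L^p distance between two equally sized matrices, taken entrywise."""
    total = sum(abs(x - y) ** p for r1, r2 in zip(m1, m2) for x, y in zip(r1, r2))
    return total ** (1.0 / p)


# Two unweighted rooted trees on taxa 1, 2, 3: ((1,2),3) and (1,(2,3)).
taxa = [1, 2, 3]
t1_parent = {1: "u", 2: "u", "u": "r", 3: "r"}
t2_parent = {2: "v", 3: "v", "v": "r", 1: "r"}

m1 = split_path_lengths(t1_parent, taxa)
m2 = split_path_lengths(t2_parent, taxa)
print(m1)
print(m2)
print(lp_distance(m1, m2, 1), lp_distance(m1, m2, 2))
```

Under this assumption the matrix is in general not symmetric, and a taxon nested above another (an ancestor that is itself labeled) simply yields a zero entry in one direction. Arc weights can be supplied through the optional `weight` dictionary, keyed by the child node of each arc.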
