A parsimony-based metric for phylogenetic trees

In evolutionary biology, various metrics have been defined and studied for comparing phylogenetic trees. Such metrics are used, for example, to compare competing evolutionary hypotheses or to help organize algorithms that search for optimal trees. Here we introduce a new metric d_p on the collection of binary phylogenetic trees that are all labeled by the same set of species. The metric is based on the parsimony score, an important concept in phylogenetics that is commonly used to construct phylogenetic trees. Our main results include a characterization of the unit neighborhood of a tree in the d_p metric and an explicit formula for the diameter of d_p, that is, for the maximum possible value of d_p over all pairs of trees labeled by the same set of species. We also show that d_p is closely related to the well-known tree bisection and reconnection (TBR) and subtree prune and regraft (SPR) distances, a connection that may provide a useful new approach to understanding properties of these and related metrics.
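To make the parsimony score concrete, here is a minimal sketch of Fitch's classical algorithm, which computes the parsimony score of a single character on a binary tree. The nested-tuple tree encoding and the names `fitch`, `parsimony_score`, `t1`, `t2` and `chi` are assumptions made for illustration. The abstract does not spell out how d_p is built from these scores, so the sketch stops at the score itself.

```python
def fitch(tree, states):
    """Return (candidate state set, change count) for a rooted binary tree.

    `tree` is either a leaf label (str) or a pair (left, right) of subtrees.
    `states` maps each leaf label to its character state.
    This encoding is an illustrative assumption, not taken from the paper.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # The children disagree: one change is required on this part of the tree.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, states):
    """Minimum number of state changes needed to explain `states` on `tree`."""
    return fitch(tree, states)[1]


# Two binary trees on the same four species, rooted arbitrarily
# (the parsimony score of an unrooted tree does not depend on the rooting).
t1 = (("a", "b"), ("c", "d"))
t2 = (("a", "c"), ("b", "d"))

# A two-state character on the species.
chi = {"a": 0, "b": 0, "c": 1, "d": 1}

print(parsimony_score(t1, chi), parsimony_score(t2, chi))
```

The same character needs one change on `t1` but two on `t2`, which shows how parsimony scores can distinguish trees that carry the same species labels.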
