Subtree Transfer Operations and Their Induced Metrics on Evolutionary Trees

Abstract. Leaf-labelled trees are widely used to describe evolutionary relationships, particularly in biology. In this setting, extant species label the leaves of the tree, while the internal vertices correspond to ancestral species. Various techniques exist for reconstructing these evolutionary trees from data, and an important problem is to determine how "far apart" two such reconstructed trees are from each other, or indeed from the true historical tree. Investigating this question requires tree metrics, and these can be induced by operations that rearrange trees locally. Here we investigate three such operations: nearest neighbour interchange (NNI), subtree prune and regraft (SPR), and tree bisection and reconnection (TBR). The SPR operation is of particular interest because it can be used to model biological processes such as horizontal gene transfer and recombination. We count the number of unrooted binary trees one SPR away from any given unrooted binary tree, and we provide new upper and lower bounds for the diameter of the adjacency graph of trees under SPR and TBR. We also show that the problem of computing the minimum number of TBR operations required to transform one tree into another can be reduced to a problem whose size is a function only of the distance between the trees (and not of the size of the two trees), thereby establishing that the problem is fixed-parameter tractable.
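To make the idea of a local rearrangement concrete, the following Python sketch enumerates the NNI neighbours of a small unrooted binary tree by brute force. It is an illustration only and not the paper's method: the adjacency-map representation, the caterpillar test tree, and the function names (`caterpillar`, `leaves_beyond`, `splits`, `nni_neighbours`) are all assumptions introduced here. Trees are compared through their sets of nontrivial leaf bipartitions, so two rearrangements that produce the same leaf-labelled tree are counted once.

```python
def caterpillar(n):
    """Hypothetical helper: unrooted binary caterpillar tree on n >= 4 leaves.

    Leaves are 0..n-1 and internal vertices are n..2n-3.
    Returns an adjacency map {vertex: set of neighbours}.
    """
    adj = {v: set() for v in range(2 * n - 2)}

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    internal = list(range(n, 2 * n - 2))      # n - 2 internal vertices
    for a, b in zip(internal, internal[1:]):  # backbone path
        link(a, b)
    link(0, internal[0])                      # extra leaf at one end
    link(n - 1, internal[-1])                 # extra leaf at the other end
    for i, v in enumerate(internal):          # one leaf per backbone vertex
        link(i + 1, v)
    return adj


def leaves_beyond(adj, n, u, v):
    """Leaves reachable from v without crossing the edge {u, v}."""
    stack, seen, found = [v], {u, v}, set()
    while stack:
        x = stack.pop()
        if x < n:
            found.add(x)
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return frozenset(found)


def splits(adj, n):
    """Canonical form: nontrivial bipartitions, stored as the side without leaf 0."""
    result = set()
    for u in adj:
        for v in adj[u]:
            if n <= u < v:                    # each internal edge once
                side = leaves_beyond(adj, n, u, v)
                if 0 in side:
                    side = frozenset(range(n)) - side
                result.add(side)
    return frozenset(result)


def nni_neighbours(adj, n):
    """Set of canonical forms of all trees one NNI away from adj."""
    out = set()
    for u in range(n, 2 * n - 2):
        for v in adj[u]:
            if v > u:                         # internal edge {u, v}
                b = min(adj[u] - {v})         # fix one subtree hanging off u
                for c in adj[v] - {u}:        # swap it with each subtree off v
                    new = {k: set(s) for k, s in adj.items()}
                    new[u].remove(b); new[b].remove(u)
                    new[v].remove(c); new[c].remove(v)
                    new[u].add(c); new[c].add(u)
                    new[v].add(b); new[b].add(v)
                    out.add(splits(new, n))
    return out


if __name__ == "__main__":
    for n in range(4, 9):
        print(n, len(nni_neighbours(caterpillar(n), n)))
```

Each of the n - 3 internal edges admits two interchanges, so the count printed for each n can be compared with the standard value 2(n - 3) for NNI, which the abstract itself does not state. The same canonical form could in principle be reused to enumerate SPR or TBR neighbours, although those moves are not implemented in this sketch.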
