Computing nonbinary agreement forests

Given two rooted phylogenetic trees, the Maximum Agreement Forest (MAF) problem asks for a forest that is, in a certain sense, common to both trees and has a minimum number of components. There has been considerable interest in the special case in which the input trees are required to be binary. This binary version is known to be 3-approximable in linear time and exactly solvable in O(2.42^k n) time, where k is the number of components of a maximum agreement forest and n is the number of leaves. The general, nonbinary problem is known to have a (d+1)-approximation for trees with outdegree at most d, and a kernel of size 64k. Here we show that nonbinary MAF has a polynomial-time 4-approximation and an exact fixed-parameter tractable algorithm that runs in O(4^k poly(n)) time. The algorithms have been implemented and are publicly available.
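To get a feel for the two exponential factors quoted above, the following minimal sketch compares 2.42^k (binary case) with 4^k (nonbinary case) for a few values of k. It is only an arithmetic illustration: it ignores the polynomial factors and the constants hidden by the O-notation, so it says nothing about actual running times. The function name `exponential_factor` is hypothetical and is not part of the published implementation.

```python
def exponential_factor(base: float, k: int) -> float:
    """Return base**k, the exponential part of an O(base^k * poly(n)) bound.

    Illustrative helper only; not taken from the paper's software.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    return float(base) ** k


# Compare the binary (2.42^k) and nonbinary (4^k) factors for a few k.
for k in (5, 10, 20):
    binary = exponential_factor(2.42, k)
    nonbinary = exponential_factor(4, k)
    print(f"k={k:2d}  2.42^k={binary:.3e}  4^k={nonbinary:.3e}  "
          f"ratio={nonbinary / binary:.1f}")
```

The ratio between the two factors is (4/2.42)^k, so the gap between the bounds itself grows exponentially in k.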
