Hybridization Number on Three Trees

Phylogenetic networks are leaf-labelled directed acyclic graphs that are used to describe non-treelike evolutionary histories and are thus a generalization of phylogenetic trees. The hybridization number of a phylogenetic network is the sum of all indegrees minus the number of nodes plus one. The Hybridization Number problem takes as input a collection of phylogenetic trees and asks for a phylogenetic network that contains an embedding of each of the input trees and has the smallest possible hybridization number. We present an algorithm for the Hybridization Number problem on three binary phylogenetic trees on n leaves, which runs in time O(c^k poly(n)), with k the hybridization number of an optimal network and c some positive constant. For the case of two trees, an algorithm with running time O(3.18^k n) was proposed before, whereas an algorithm with running time O(c^k poly(n)) for more than two trees had remained elusive prior to this article. The algorithm for two trees uses the close connection to acyclic agreement forests to achieve a linear exponent in the running time, while previous algorithms for more than two trees (explicitly or implicitly) relied on a brute-force search through all possible underlying network topologies, leading to running times that are not O(c^k poly(n)) for any c. The connection to acyclic agreement forests is much weaker for more than two trees, so even given the right agreement forest, the reconstruction of the network poses major challenges. We prove novel structural results that allow us to reconstruct a network without having to guess the underlying topology. Our techniques generalize to more than three input trees with the exception of one key lemma that maps nodes in the network to tree nodes and, thus,
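The definition of the hybridization number stated in the abstract (sum of all indegrees minus the number of nodes plus one) can be illustrated with a small sketch. This is not the article's algorithm; it only evaluates the definition on a given network. The edge-list representation, the function name `hybridization_number`, and the two toy graphs are assumptions made for illustration, and the sketch assumes a connected network with a single root.

```python
from collections import defaultdict


def hybridization_number(edges):
    """Evaluate the definition: sum of all indegrees - number of nodes + 1.

    `edges` is a list of (parent, child) pairs (hypothetical representation).
    Assumes a connected, single-rooted directed acyclic graph.
    """
    indegree = defaultdict(int)
    nodes = set()
    for parent, child in edges:
        nodes.update((parent, child))
        indegree[child] += 1
    return sum(indegree.values()) - len(nodes) + 1


# A rooted binary tree on leaves a, b, c: every non-root node has indegree 1.
tree = [("root", "u"), ("root", "c"), ("u", "a"), ("u", "b")]

# A network in which node h has two parents (u and v), i.e. one reticulation.
network = [
    ("root", "u"), ("root", "v"),
    ("u", "a"), ("u", "h"),
    ("v", "h"), ("v", "c"),
    ("h", "b"),
]

print(hybridization_number(tree))     # 0
print(hybridization_number(network))  # 1
```

Because the sum of all indegrees equals the number of edges, the quantity is the same as the number of edges minus the number of nodes plus one, so a tree always evaluates to zero and each extra incoming edge at a node adds one.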
