An Algorithm for Constructing Parsimonious Hybridization Networks with Multiple Phylogenetic Trees

A phylogenetic network is a model for reticulate evolution. A hybridization network is a type of phylogenetic network that is built for a set of discordant gene trees and "displays" each gene tree. A central computational problem on hybridization networks is: given a set of gene trees, reconstruct the minimum (i.e., most parsimonious) hybridization network that displays each given gene tree, that is, a network with the fewest reticulation events. This problem is known to be NP-hard, and existing approaches are either heuristics or make simplifying assumptions (e.g., they work with only two input trees or assume certain topological properties). In this article, we develop an exact algorithm (called PIRNC) for inferring minimum hybridization networks from multiple gene trees. The PIRNC algorithm does not rely on structural assumptions (e.g., that the network is a so-called galled network). To the best of our knowledge, PIRNC is the first exact algorithm implemented for this formulation. When the number of reticulation events is relatively small (say, four or fewer), PIRNC runs reasonably efficiently even for moderately large datasets. For building more complex networks, we also develop a heuristic version of PIRNC called PIRNCH. Simulations show that PIRNCH usually produces networks with fewer reticulation events than those produced by an existing method. PIRNC and PIRNCH have been implemented as part of the software package PIRN, which is available online.
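
To make the notions of "displays" and "number of reticulation events" concrete, the following is a minimal brute-force sketch. It is not the PIRNC algorithm and is not taken from the PIRN package. It assumes a network is given as a list of directed edges, counts each node with in-degree d >= 2 as d - 1 reticulation events, and treats a tree as displayed when keeping exactly one incoming edge per node can yield a tree with the same set of leaf clusters. All function names, node names, and the example data are hypothetical.

```python
from itertools import product

def clusters(children, root, leaves):
    """Set of non-empty leaf clusters below each node of a rooted tree."""
    found = set()
    def visit(v):
        if v in leaves:
            c = frozenset([v])
        else:
            c = frozenset().union(*(visit(w) for w in children.get(v, [])))
        if c:  # dead-end internal nodes contribute nothing
            found.add(c)
        return c
    visit(root)
    return found

def tree_clusters(edges, root, leaves):
    """Clusters of a tree given as a list of (parent, child) edges."""
    children = {}
    for u, v in edges:
        children.setdefault(u, []).append(v)
    return clusters(children, root, leaves)

def reticulation_number(edges):
    """Sum of (in-degree - 1) over nodes with in-degree >= 2 (assumed definition)."""
    indeg = {}
    for _, v in edges:
        indeg[v] = indeg.get(v, 0) + 1
    return sum(d - 1 for d in indeg.values() if d >= 2)

def displays(edges, root, leaves, target):
    """Brute force: try every way of keeping one parent per node.

    Exponential in the number of reticulation nodes; illustration only.
    """
    parents = {}
    for u, v in edges:
        parents.setdefault(v, []).append(u)
    nodes = list(parents)
    for choice in product(*(parents[v] for v in nodes)):
        children = {}
        for v, u in zip(nodes, choice):
            children.setdefault(u, []).append(v)
        if clusters(children, root, leaves) == target:
            return True
    return False

# Hypothetical example: leaf b hangs below hybrid node h with parents x and y.
leaves = {"a", "b", "c"}
network = [("r", "x"), ("r", "y"), ("x", "a"), ("x", "h"),
           ("y", "h"), ("y", "c"), ("h", "b")]
t1 = [("r", "u"), ("r", "c"), ("u", "a"), ("u", "b")]  # ((a,b),c)
t2 = [("r", "a"), ("r", "u"), ("u", "b"), ("u", "c")]  # (a,(b,c))
t3 = [("r", "u"), ("r", "b"), ("u", "a"), ("u", "c")]  # ((a,c),b)

for name, tree in [("t1", t1), ("t2", t2), ("t3", t3)]:
    print(name, displays(network, "r", leaves, tree_clusters(tree, "r", leaves)))
print("reticulations:", reticulation_number(network))
```

In this toy example the single hybrid node lets the network display both ((a,b),c) and (a,(b,c)) but not ((a,c),b). The enumeration grows exponentially with the number of reticulation nodes, so it only serves to illustrate the definitions.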
