Fixed-Parameter and Approximation Algorithms for Maximum Agreement Forests of Multifurcating Trees

We present efficient fixed-parameter and approximation algorithms for the NP-hard problem of computing a maximum agreement forest (MAF) of a pair of multifurcating (nonbinary) rooted trees. Multifurcating trees arise naturally as a result of statistical uncertainty in current tree construction methods. The size of an MAF corresponds to the subtree prune-and-regraft distance of the two trees and is intimately connected to their hybridization number. These distance measures are essential tools for understanding reticulate evolution, such as lateral gene transfer, recombination, and hybridization. Our algorithms nearly match the running times of the best algorithms currently known for the binary case. This is achieved using a combination of efficient branching rules (similar to, but more complex than, those for the binary case) and a novel edge protection scheme that further reduces the size of the search space the algorithms need to explore.
