A Reduction Algorithm for Computing the Hybridization Number of Two Trees

Hybridization is an important evolutionary process for many groups of species. Thus, conflicting signals in a data set may not be the result of sampling or modeling errors, but may instead reflect the fact that hybridization has played a significant role in the evolutionary history of the species under consideration. Assuming that the initial set of gene trees is correct, a basic problem for biologists is to compute the minimum number of hybridization events needed to explain this set. In this paper, we describe a new reduction-based algorithm for computing this minimum number when the initial data set consists of two trees. Although the two-tree problem is NP-hard, our algorithm always gives the exact solution and runs efficiently on many real biological problems. Previous algorithms for the two-tree problem either solve a restricted version of the problem or give an answer with no guarantee of how close it is to the exact solution. We illustrate our algorithm on a grass data set. The algorithm is freely available at either http://www.bi.uni-duesseldorf.de/~linz or http://www.math.canterbury.ac.nz/~cas83.
