ALGORITHMIC STRATEGIES FOR ESTIMATING THE AMOUNT OF RETICULATION FROM A COLLECTION OF GENE TREES

Phylogenetic networks have emerged as a unifying evolutionary model of both vertical and horizontal inheritance. A major approach for reconstructing such networks is to reconcile gene trees that are reconstructed from various genomic regions. The Subtree Prune and Regraft (SPR) operation has been used to obtain lower-bound estimates of the number of reticulation events from a pair of trees. In general, however, more than two trees are available, and to date no work exists on using the SPR operation to estimate the amount of reticulation from a collection of trees rather than a single pair. In this paper we address this problem and propose two algorithmic strategies for solving it heuristically. The first is based on a simple, yet novel, observation on the binomial distribution of pairwise distances of trees inside a network. The second is based on aggregating solutions from pairwise computations. We have implemented both approaches and studied their performance in extensive simulations. In general, the methods produce good estimates of the minimum number of reticulation events required to reconcile a set of trees. We also identify conditions under which the methods perform less well, to help guide the development of new methods in this area.
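To make the pairwise starting point concrete, the sketch below computes a naive baseline for a collection of trees: the largest pairwise distance. It is neither of the two strategies proposed in the paper, whose details are not given in this abstract. The reasoning is only that a network reconciling the whole collection must also reconcile every pair, so if each pairwise SPR distance is a lower bound for its pair, the maximum over all pairs is a lower bound for the collection. The function name `pairwise_lower_bound` and the `distance` callback are hypothetical; the callback stands in for whatever SPR distance routine is available.

```python
from itertools import combinations
from typing import Callable, Sequence, TypeVar

Tree = TypeVar("Tree")


def pairwise_lower_bound(
    trees: Sequence[Tree],
    distance: Callable[[Tree, Tree], int],
) -> int:
    """Return the largest pairwise distance over all unordered pairs of trees.

    `distance` is an assumed, user-supplied function (for example, an exact
    rooted SPR distance). With fewer than two trees there are no pairs, so
    the result is 0.
    """
    return max(
        (distance(a, b) for a, b in combinations(trees, 2)),
        default=0,
    )


# Toy usage with made-up distances between three tree labels
# (illustrative numbers only, not results from the paper).
toy_distances = {
    frozenset({"T1", "T2"}): 1,
    frozenset({"T1", "T3"}): 3,
    frozenset({"T2", "T3"}): 2,
}
bound = pairwise_lower_bound(
    ["T1", "T2", "T3"],
    lambda a, b: toy_distances[frozenset({a, b})],
)
```

This baseline can underestimate the true minimum, since different pairs may require different reticulation events, which is the gap the collection-level estimation problem addresses.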
